On Lattices, Learning with Errors, Random Linear Codes, 
and Cryptography 


ODED REGEV 


Tel Aviv University, Tel Aviv, Israel 


Abstract. Our main result is a reduction from worst-case lattice problems such as GAPSVP and SIVP
to a certain learning problem. This learning problem is a natural extension of the “learning from parity
with error” problem to higher moduli. It can also be viewed as the problem of decoding from a random 
linear code. This, we believe, gives a strong indication that these problems are hard. Our reduction, 
however, is quantum. Hence, an efficient solution to the learning problem implies a quantum algorithm 
for GAPSVP and SIVP. A main open question is whether this reduction can be made classical (i.e.,
nonquantum). 

We also present a (classical) public-key cryptosystem whose security is based on the hardness 
of the learning problem. By the main result, its security is also based on the worst-case quantum 
hardness of GAPSVP and SIVP. The new cryptosystem is much more efficient than previous lattice-based
cryptosystems: the public key is of size $\tilde{O}(n^2)$ and encrypting a message increases its size by
a factor of $\tilde{O}(n)$ (in previous cryptosystems these values are $\tilde{O}(n^4)$ and $\tilde{O}(n^2)$, respectively). In fact,
under the assumption that all parties share a random bit string of length $\tilde{O}(n^2)$, the size of the public
key can be reduced to $\tilde{O}(n)$.


Categories and Subject Descriptors: E.3 [Data Encryption]: Public key cryptosystems 
General Terms: Algorithms, Theory 


Additional Key Words and Phrases: Lattice, cryptography, quantum computation, public key encryp- 
tion, average-case hardness 


ACM Reference Format: 


Regev, O. 2009. On lattices, learning with errors, random linear codes, and cryptography. J. ACM 56, 
6, Article 34 (September 2009), 40 pages. 
DOI = 10.1145/1568318.1568324 http://doi.acm.org/10.1145/1568318.1568324 


The work of O. Regev was supported by an Alon Fellowship, by the Binational Science Foundation, 
by the Israel Science Foundation, by the Army Research Office grant DAAD19-03-1-0082, by the 
European Commission under the Integrated Project QAP funded by the IST directorate as Contract 
Number 015848, and by a European Research Council (ERC) Starting Grant. 


Author’s address: O. Regev, School of Computer Science, Tel Aviv University, Tel Aviv 69978, 
Israel. 




Journal of the ACM, Vol. 56, No. 6, Article 34, Publication date: September 2009. 




1. Introduction 


Main Theorem. For an integer $n \ge 1$ and a real number $\varepsilon \ge 0$, consider the “learning
from parity with error” problem, defined as follows: the goal is to find an unknown
$s \in \mathbb{Z}_2^n$ given a list of “equations with errors”

$$
\begin{aligned}
\langle s, a_1 \rangle &\approx_\varepsilon b_1 \pmod{2} \\
\langle s, a_2 \rangle &\approx_\varepsilon b_2 \pmod{2} \\
&\;\;\vdots
\end{aligned}
$$

where the $a_i$'s are chosen independently from the uniform distribution on $\mathbb{Z}_2^n$,
$\langle s, a_i \rangle = \sum_j s_j (a_i)_j$ is the inner product modulo 2 of $s$ and $a_i$, and each equation is
correct independently with probability $1 - \varepsilon$. More precisely, the input to the problem
consists of pairs $(a_i, b_i)$ where each $a_i$ is chosen independently and uniformly
from $\mathbb{Z}_2^n$ and each $b_i$ is independently chosen to be equal to $\langle s, a_i \rangle$ with probability
$1 - \varepsilon$. The goal is to find $s$. Notice that the case $\varepsilon = 0$ can be solved efficiently by,
say, Gaussian elimination. This requires $O(n)$ equations and $\mathrm{poly}(n)$ time.
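
The noiseless case can be made concrete with a small sketch. The Python below is illustrative only: the helper names `parity_samples` and `solve_noiseless`, the dimension 16, and the sample count 64 are assumptions chosen for the example and do not come from the article. It draws equations over $\mathbb{Z}_2$ and, for $\varepsilon = 0$, recovers $s$ by Gaussian elimination.

```python
import random

def parity_samples(s, m, eps, rng):
    """Return m pairs (a, b): a uniform in Z_2^n, b = <s, a> mod 2,
    flipped independently with probability eps."""
    n = len(s)
    samples = []
    for _ in range(m):
        a = [rng.randrange(2) for _ in range(n)]
        b = sum(si * ai for si, ai in zip(s, a)) % 2
        if rng.random() < eps:
            b ^= 1
        samples.append((a, b))
    return samples

def solve_noiseless(samples, n):
    """Gaussian elimination over Z_2 on the augmented rows [a | b].
    Returns s, or None if the vectors a do not span Z_2^n."""
    rows = [a[:] + [b] for a, b in samples]
    for col in range(n):
        pivot = next((r for r in range(col, len(rows)) if rows[r][col]), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(len(rows)):
            if r != col and rows[r][col]:
                rows[r] = [x ^ y for x, y in zip(rows[r], rows[col])]
    # The first n rows now read [identity | s].
    return [rows[i][n] for i in range(n)]

rng = random.Random(0)
secret = [rng.randrange(2) for _ in range(16)]
recovered = solve_noiseless(parity_samples(secret, 64, 0.0, rng), 16)
```

With any positive `eps` the same elimination still runs, but the recovered vector is no longer reliable, which is the difficulty discussed next.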

The problem seems to become significantly harder when we take any positive
$\varepsilon > 0$. For example, let us consider again the Gaussian elimination process and
assume that we are interested in recovering only the first bit of $s$. Using Gaussian
elimination, we can find a set $S$ of $O(n)$ equations such that $\sum_{i \in S} a_i$ is $(1, 0, \ldots, 0)$.
Summing the corresponding values $b_i$ gives us a guess for the first bit of $s$. However,
a standard calculation shows that this guess is correct with probability $\frac{1}{2} + 2^{-\Theta(n)}$.
Hence, in order to obtain the first bit with good confidence, we have to repeat the
whole procedure $2^{\Theta(n)}$ times. This yields an algorithm that uses $2^{O(n)}$ equations and
$2^{O(n)}$ time. In fact, it can be shown that given only $O(n)$ equations, the $s' \in \mathbb{Z}_2^n$
that maximizes the number of satisfied equations is with high probability $s$. This
yields a simple maximum likelihood algorithm that requires only $O(n)$ equations
and runs in time $2^{O(n)}$.
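
One way to carry out the calculation mentioned above: if $k$ right-hand sides, each wrong independently with probability $\varepsilon$, are summed modulo 2, the sum is correct with probability $\frac{1}{2} + \frac{1}{2}(1 - 2\varepsilon)^k$, which is $\frac{1}{2} + 2^{-\Theta(n)}$ for $k = \Theta(n)$ and constant $\varepsilon$. The sketch below evaluates this expression and also runs the brute-force maximum likelihood search on a toy instance. The function names and the parameters ($n = 8$, 200 equations, $\varepsilon = 0.1$) are assumptions made for illustration.

```python
import random
from itertools import product

def xor_success_probability(eps, k):
    """Probability that the XOR of k noisy right-hand sides is correct,
    when each is wrong independently with probability eps."""
    return 0.5 + 0.5 * (1 - 2 * eps) ** k

def inner_mod2(u, v):
    return sum(x * y for x, y in zip(u, v)) % 2

def max_likelihood(samples, n):
    """Try all 2^n candidates and keep the one satisfying the most equations."""
    best, best_score = None, -1
    for cand in product((0, 1), repeat=n):
        score = sum(1 for a, b in samples if inner_mod2(cand, a) == b)
        if score > best_score:
            best, best_score = list(cand), score
    return best

rng = random.Random(1)
n, m, eps = 8, 200, 0.1
secret = [rng.randrange(2) for _ in range(n)]
samples = []
for _ in range(m):
    a = [rng.randrange(2) for _ in range(n)]
    b = inner_mod2(secret, a) ^ (rng.random() < eps)  # flip with probability eps
    samples.append((a, b))
guess = max_likelihood(samples, n)
```

The search visits all $2^n$ candidates, so it is only feasible for very small $n$, in line with the $2^{O(n)}$ running time stated above.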

Blum et al. [2003] provided the first subexponential algorithm for this problem.
Their algorithm requires only $2^{O(n/\log n)}$ equations/time and is currently the best
known algorithm for the problem. It is based on a clever idea that allows one to find a
small set $S$ of equations (say, $O(\sqrt{n})$) among $2^{O(n/\log n)}$ equations, such that $\sum_{i \in S} a_i$
is, say, $(1, 0, \ldots, 0)$. This gives us a guess for the first bit of $s$ that is correct with
probability $\frac{1}{2} + 2^{-\Theta(\sqrt{n})}$. We can obtain the correct value with high probability by
repeating the whole procedure only $2^{O(\sqrt{n})}$ times. Their idea was later shown to
have other important applications, such as the first $2^{O(n)}$-time algorithm for solving
the shortest vector problem [Kumar and Sivakumar 2001; Ajtai et al. 2001].

An important open question is to explain the apparent difficulty in finding efficient 
algorithms for this learning problem. Our main theorem explains this difficulty for 
a natural extension of this problem to higher moduli, defined next. 

FIG. 1. $\bar{\Psi}_\alpha$ for $p = 127$ with $\alpha = 0.05$ (left) and $\alpha = 0.1$ (right). The elements of $\mathbb{Z}_p$ are arranged
on a circle.

Let $p = p(n) \le \mathrm{poly}(n)$ be some prime integer and consider a list of “equations
with error”

$$
\begin{aligned}
\langle s, a_1 \rangle &\approx_\chi b_1 \pmod{p} \\
\langle s, a_2 \rangle &\approx_\chi b_2 \pmod{p} \\
&\;\;\vdots
\end{aligned}
$$

where this time $s \in \mathbb{Z}_p^n$, the $a_i$ are chosen independently and uniformly from $\mathbb{Z}_p^n$, and
$b_i \in \mathbb{Z}_p$. The error in the equations is now specified by a probability distribution
$\chi : \mathbb{Z}_p \to \mathbb{R}^+$ on $\mathbb{Z}_p$. Namely, for each equation $i$, $b_i = \langle s, a_i \rangle + e_i$ where each
$e_i \in \mathbb{Z}_p$ is chosen independently according to $\chi$. We denote the problem of recovering
$s$ from such equations by $\mathrm{LWE}_{p,\chi}$ (learning with error). For example,
the learning from parity problem with error $\varepsilon$ is the special case where $p = 2$,
$\chi(0) = 1 - \varepsilon$, and $\chi(1) = \varepsilon$. Under a reasonable assumption on $\chi$ (namely, that
$\chi(0) > 1/p + 1/\mathrm{poly}(n)$), the maximum likelihood algorithm described above
solves $\mathrm{LWE}_{p,\chi}$ for $p \le \mathrm{poly}(n)$ using $\mathrm{poly}(n)$ equations and $2^{O(n \log n)}$ time. Under
a similar assumption, an algorithm resembling the one by Blum et al. [2003]
requires only $2^{O(n)}$ equations/time. This is the best known algorithm for the LWE
problem.
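
As a rough illustration of the definition, the following sketch generates equations $b_i = \langle s, a_i \rangle + e_i \pmod p$ for an arbitrary error distribution $\chi$. The name `lwe_samples` and the representation of $\chi$ as a list of $p$ probabilities are assumptions of this example; the learning from parity problem appears as the special case $p = 2$.

```python
import random

def lwe_samples(s, p, chi, m, rng):
    """Return m pairs (a, b) with a uniform in Z_p^n and
    b = <s, a> + e mod p, where Pr[e = k] = chi[k] for k in Z_p."""
    n = len(s)
    out = []
    for _ in range(m):
        a = [rng.randrange(p) for _ in range(n)]
        e = rng.choices(range(p), weights=chi)[0]
        b = (sum(si * ai for si, ai in zip(s, a)) + e) % p
        out.append((a, b))
    return out

rng = random.Random(0)

# Learning from parity with error eps: p = 2, chi(0) = 1 - eps, chi(1) = eps.
eps = 0.1
parity_equations = lwe_samples([1, 0, 1, 1], 2, [1 - eps, eps], 10, rng)

# A larger modulus with an error concentrated on {-1, 0, 1} (illustrative chi).
p = 11
chi = [0.5, 0.25] + [0.0] * (p - 3) + [0.25]
equations_mod_p = lwe_samples([3, 1, 4], p, chi, 10, rng)
```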

Our main theorem shows that for certain choices of $p$ and $\chi$, a solution to $\mathrm{LWE}_{p,\chi}$
implies a quantum solution to worst-case lattice problems.


THEOREM 1.1. (Informal) Let $n$, $p$ be integers and $\alpha \in (0, 1)$ be such that
$\alpha p > 2\sqrt{n}$. If there exists an efficient algorithm that solves $\mathrm{LWE}_{p,\bar{\Psi}_\alpha}$, then there
exists an efficient quantum algorithm that approximates the decision version of the
shortest vector problem (GAPSVP) and the shortest independent vectors problem
(SIVP) to within $\tilde{O}(n/\alpha)$ in the worst case.


The exact definition of $\bar{\Psi}_\alpha$ will be given later. For now, it is enough to know that
it is a distribution on $\mathbb{Z}_p$ that has the shape of a discrete Gaussian centered around
0 with standard deviation $\alpha p$, as in Figure 1. Also, the probability of 0 (i.e., no
error) is roughly $1/(\alpha p)$. A possible setting for the parameters is $p = O(n^2)$ and
$\alpha = 1/(\sqrt{n} \log^2 n)$ (in fact, these are the parameters that we use in our cryptographic
application).
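
The shape of such an error distribution can be explored numerically. The sketch below is an assumption-laden stand-in, since the exact definition only appears later in the article: it draws a normal variable with standard deviation $\alpha p / \sqrt{2\pi}$ (of order $\alpha p$, as described above), rounds it to the nearest integer, and reduces modulo $p$. Under this choice the density at 0 is $1/(\alpha p)$, matching the stated probability of no error. The function name `sample_psi_bar` and the trial count are hypothetical.

```python
import math
import random

def sample_psi_bar(alpha, p, rng):
    """Rounded Gaussian on Z_p (an assumed stand-in for the error distribution):
    normal with standard deviation alpha*p/sqrt(2*pi), rounded, reduced mod p."""
    sigma = alpha * p / math.sqrt(2 * math.pi)
    return round(rng.gauss(0.0, sigma)) % p

rng = random.Random(0)
alpha, p, trials = 0.1, 127, 50000
draws = [sample_psi_bar(alpha, p, rng) for _ in range(trials)]

# Empirical probability of "no error", to be compared with 1/(alpha*p).
zero_fraction = draws.count(0) / trials
predicted = 1 / (alpha * p)
```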

GAPSVP and SIVP are two of the main computational problems on lattices. In
GAPSVP, for instance, the input is a lattice, and the goal is to approximate the length
of the shortest nonzero lattice vector. The best known polynomial time algorithms 
for them yield only mildly subexponential approximation factors [Lenstra et al. 
1982; Schnorr 1987; Ajtai et al. 2001]. It is conjectured that there is no classical 
(i.e., nonquantum) polynomial time algorithm that approximates them to within 
any polynomial factor. Lattice-based constructions of one-way functions, such as 
the one by Ajtai [2004], are based on this conjecture.

One might even conjecture that there is no quantum polynomial time algorithm
that approximates GAPSVP (or SIVP) to within any polynomial factor. One can then
interpret the main theorem as saying that based on this conjecture, the LWE problem
is hard. The only evidence supporting this conjecture is that there are no known
quantum algorithms for lattice problems that outperform classical algorithms, even
though this is probably one of the most important open questions in the field of
quantum computing.¹

In fact, one could also interpret our main theorem as a way to disprove this 
conjecture: if one finds an efficient algorithm for LWE, then one also obtains a 
quantum algorithm for approximating worst-case lattice problems. Such a result 
would be of tremendous importance on its own. Finally, we note that it is possible 
that our main theorem will one day be made classical. This would make all our 
results stronger and the above discussion unnecessary. 

The LWE problem can be equivalently presented as the problem of decoding
random linear codes. More specifically, let $m = \mathrm{poly}(n)$ be arbitrary and let $s \in \mathbb{Z}_p^n$
be some vector. Then, consider the following problem: given a random matrix
$Q \in \mathbb{Z}_p^{m \times n}$ and the vector $t = Qs + e \in \mathbb{Z}_p^m$, where each coordinate of the error vector
$e \in \mathbb{Z}_p^m$ is chosen independently from $\bar{\Psi}_\alpha$, recover $s$. The Hamming weight of $e$ is
roughly $m(1 - 1/(\alpha p))$ (since a value chosen from $\bar{\Psi}_\alpha$ is 0 with probability roughly
$1/(\alpha p)$). Hence, the Hamming distance of $t$ from $Qs$ is roughly $m(1 - 1/(\alpha p))$.
Moreover, it can be seen that for large enough $m$, for any other word $s'$, the Hamming
distance of $t$ from $Qs'$ is roughly $m(1 - 1/p)$. Hence, we obtain that approximating
the nearest codeword problem to within factors smaller than $(1 - 1/p)/(1 - 1/(\alpha p))$
on random codes is as hard as quantumly approximating worst-case lattice problems.
This gives a partial answer to the important open question of understanding
the hardness of decoding from random linear codes.

It turns out that certain problems, which are seemingly easier than the LWE 
problem, are in fact equivalent to the LWE problem. We establish these equivalences 
in Section 4 using elementary reductions. For example, being able to distinguish 
a set of equations as above from a set of equations in which the $b_i$'s are chosen
uniformly from $\mathbb{Z}_p$ is equivalent to solving LWE. Moreover, it is enough to correctly
distinguish these two distributions for some non-negligible fraction of all s. The 
latter formulation is the one we use in our cryptographic applications. 


Cryptosystem. In Section 5, we present a public key cryptosystem and prove 
that it is secure based on the hardness of the LWE problem. We use the standard 
security notion of semantic, or IND-CPA, security (see, e.g., Katz and Lindell 
[2008, Chap. 10]). The cryptosystem and its security proof are entirely classical. In 
fact, the cryptosystem itself is quite simple; the reader is encouraged to glance at
the beginning of Section 5. Essentially, the idea is to provide a list of equations as
above as the public key; encryption is performed by summing some of the equations
(forming another equation with error) and modifying the right-hand side depending
on the message to be transmitted. Security follows from the fact that a list of
equations with error is computationally indistinguishable from a list of equations
in which the $b_i$'s are chosen uniformly.
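
The idea described above can be sketched as a toy single-bit scheme. Everything specific here is an assumption made for illustration: the function names, the tiny parameters ($n = 10$, $m = 20$, $p = 257$), the use of a bounded error in $\{-1, 0, 1\}$ instead of $\bar{\Psi}_\alpha$, and the choice to encode the bit by adding $\lfloor p/2 \rfloor$ to the right-hand side. The sketch is not secure and is not meant to reproduce the construction of Section 5 exactly.

```python
import random

def keygen(n, m, p, rng, err_bound=1):
    """Private key s; public key is a list of m equations with small error."""
    s = [rng.randrange(p) for _ in range(n)]
    pub = []
    for _ in range(m):
        a = [rng.randrange(p) for _ in range(n)]
        e = rng.randint(-err_bound, err_bound)  # toy error, not the paper's
        b = (sum(x * y for x, y in zip(s, a)) + e) % p
        pub.append((a, b))
    return s, pub

def encrypt(pub, bit, p, rng):
    """Sum a random subset of the equations, then shift the right-hand
    side by floor(p/2) when the bit is 1."""
    n = len(pub[0][0])
    a_sum, b_sum = [0] * n, 0
    for a, b in pub:
        if rng.randrange(2):
            a_sum = [(x + y) % p for x, y in zip(a_sum, a)]
            b_sum = (b_sum + b) % p
    return a_sum, (b_sum + bit * (p // 2)) % p

def decrypt(s, ciphertext, p):
    """Output 0 if b - <a, s> is closer to 0 than to floor(p/2) modulo p."""
    a, b = ciphertext
    d = (b - sum(x * y for x, y in zip(s, a))) % p
    return 0 if min(d, p - d) < p / 4 else 1

rng = random.Random(0)
p = 257
s, pub = keygen(n=10, m=20, p=p, rng=rng)
ciphertext = encrypt(pub, 1, p, rng)
plaintext = decrypt(s, ciphertext, p)
```

Decryption is correct in this toy setting because the accumulated error is at most $m = 20$ in absolute value, well below $p/4$.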


¹ If forced to make a guess, the author would say that the conjecture is true.

By using our main theorem, we obtain that the security of the system is based
also on the worst-case quantum hardness of approximating SIVP and GAPSVP
to within $\tilde{O}(n^{1.5})$. In other words, breaking our cryptosystem implies an efficient
quantum algorithm for approximating SIVP and GAPSVP to within $\tilde{O}(n^{1.5})$. Previous
cryptosystems, such as the Ajtai-Dwork cryptosystem [Ajtai and Dwork 1997]
and the one by Regev [2004], are based on the worst-case (classical) hardness of
the unique-SVP problem, which can be related to GAPSVP (but not SIVP) through
the recent result of Lyubashevsky and Micciancio [2009].

Another important feature of our cryptosystem is its improved efficiency. In
previous cryptosystems, the public key size is $\tilde{O}(n^4)$ and the encryption increases
the size of messages by a factor of $\tilde{O}(n^2)$. In our cryptosystem, the public key size is
only $\tilde{O}(n^2)$ and encryption increases the size of messages by a factor of only $\tilde{O}(n)$.
This possibly makes our cryptosystem practical. Moreover, using an idea of Ajtai
[2005], we can reduce the size of the public key to $\tilde{O}(n)$. This requires all users of
the cryptosystem to share some (trusted) random bit string of length $\tilde{O}(n^2)$. This
can be achieved by, say, distributing such a bit string as part of the encryption and 
decryption software. 

We mention that learning problems similar to ours were already suggested as 
possible sources of cryptographic hardness in, for example, Blum et al. [1994] and 
Alekhnovich [2003], although this was done without establishing any connection 
to lattice problems. In another related work, Ajtai [2005] suggested a cryptosystem 
that has several properties in common with ours (including its efficiency), although 
its security is not based on worst-case lattice problems. 


Why quantum? This article is almost entirely classical. In fact, quantum is needed 
only in one step in the proof of the main theorem. Making this step classical would 
make the entire reduction classical. To demonstrate the difficulty, consider the 
following situation. Let $L$ be some lattice and let $d = \lambda_1(L)/n^{10}$ where $\lambda_1(L)$ is
the length of the shortest nonzero vector in $L$. We are given an oracle that for any
point $x \in \mathbb{R}^n$ within distance $d$ of $L$ finds the closest lattice vector to $x$. If $x$ is not
within distance $d$ of $L$, the output of the oracle is undefined. Intuitively, such an
oracle seems quite powerful; the best known algorithms for performing such a task
require exponential time. Nevertheless, we do not see any way to use this oracle
classically. Indeed, it seems to us that the only way to generate inputs to the oracle
is the following: somehow choose a lattice point $y \in L$ and let $x = y + z$ for some
perturbation vector $z$ of length at most $d$. Clearly, on input $x$ the oracle outputs $y$.
But this is useless since we already know y! 

It turns out that quantumly, such an oracle is quite useful. Indeed, being able 
to compute y from x allows us to uncompute y. More precisely, it allows us to 
transform the quantum state $|x, y\rangle$ to the state $|x, 0\rangle$ in a reversible (i.e., unitary)
way. This ability to erase the contents of a memory cell in a reversible way seems 
useful only in the quantum setting. 


Techniques. Unlike previous constructions of lattice-based public-key cryptosys- 
tems, the proof of our main theorem uses an “iterative construction”. Essentially, 
this means that instead of ‘immediately’ finding very short vectors in a lattice, the 
reduction proceeds in steps where in each step shorter lattice vectors are found. So 
far, such iterative techniques have been used only in the construction of lattice-based 
one-way functions [Ajtai 2004; Cai and Nerurkar 1997; Micciancio 2004; Miccian- 
cio and Regev 2007]. Another novel aspect of our main theorem is its crucial use of
quantum computation. Our cryptosystem is the first classical cryptosystem whose
security is based on a quantum hardness assumption (see Moore et al. [2007] for a 
somewhat related recent work). 

Our proof is based on the Fourier transform of Gaussian measures, a technique 
that was developed in previous papers [Regev 2004; Micciancio and Regev 2007; 
Aharonov and Regev 2005]. More specifically, we use a parameter known as the 
smoothing parameter, as introduced in Micciancio and Regev [2007]. We also use 
the discrete Gaussian distribution and approximations to its Fourier transform, ideas 
that were developed in Aharonov and Regev [2005]. 


Open questions. The main open question raised by this work is whether The- 
orem 1.1 can be dequantized: can the hardness of LWE be established based on 
the classical hardness of SIVP and GAPSVP? We see no reason why this should 
be impossible. However, despite our efforts over the last few years, we were not 
able to show this. As mentioned above, the difficulty is that there seems to be no 
classical way to use an oracle that solves the closest vector problem within small 
distances. Quantumly, however, such an oracle turns out to be quite useful. 

Another important open question is to determine the hardness of the learning 
from parity with errors problem (i.e., the case $p = 2$). Our theorem only works for
$p > 2\sqrt{n}$. It seems that in order to prove similar results for smaller values of $p$,
substantially new ideas are required. Alternatively, one can interpret our inability 
to prove hardness for small p as an indication that the problem might be easier than 
believed. 

Finally, it would be interesting to relate the LWE problem to other average-case 
problems in the literature, and especially to those considered by Feige [2002]. See 
Alekhnovich [2003] for some related work. 


Followup work. We now describe some of the followup work that has appeared 
since the original publication of our results in Regev [2005]. 

One line of work focussed on improvements to our cryptosystem. First, Kawachi 
et al. [2007] proposed a modification to our cryptosystem that slightly improves 
the encryption blowup to O(n), essentially getting rid of a log factor. A much 
more significant improvement is described by Peikert et al. [2008]. By a relatively 
simple modification to the cryptosystem, they managed to bring the encryption 
blowup down to only O(1), in addition to several equally significant improvements 
in running time. Finally, Akavia et al. [2009] show that our cryptosystem remains 
secure even if almost the entire secret key is leaked. 

Another line of work focussed on the design of other cryptographic protocols 
whose security is based on the hardness of the LWE problem. First, Peikert and 
Waters [2008] constructed, among other things, CCA-secure cryptosystems (see 
also Peikert [2009] for a simpler construction). These are cryptosystems that are 
secure even if the adversary is allowed access to a decryption oracle (see, e.g., Katz 
and Lindell [2008, Chap. 10]). All previous lattice-based cryptosystems (including 
the one in this article) are not CCA-secure. Second, Peikert et al. [2008] showed 
how to construct oblivious transfer protocols, which are useful, for example, for per- 
forming secure multiparty computation. Third, Gentry et al. [2008] constructed an 
identity-based encryption (IBE) scheme. This is a public-key encryption scheme in 
which the public key can be any unique identifier of the user; very few constructions 
of such schemes are known. Finally, Cash et al. [2009] constructed a public-key
cryptosystem that remains secure even when the encrypted messages may depend
upon the secret key. The security of all the above constructions is based on the LWE 
problem and hence, by our main theorem, also on the worst-case quantum hardness 
of lattice problems. 

The LWE problem has also been used by Klivans and Sherstov [2009] to show 
hardness results related to learning halfspaces. As before, due to our main theo- 
rem, this implies hardness of learning halfspaces based on the worst-case quantum 
hardness of lattice problems. 

Finally, we mention two results giving further evidence for the hardness of the 
LWE problem. In the first, Peikert [2008] somewhat strengthens our main theorem 
by replacing our worst-case lattice problems with their analogues for the $\ell_q$ norm,
where $2 \le q \le \infty$ is arbitrary. Our main theorem only deals with the standard $\ell_2$
versions. 

In another recent result, Peikert [2009] shows that the quantum part of our proof 
can be removed, leading to a classical reduction from GAPSVP to the LWE problem.
As a result, Peikert is able to show that public-key cryptosystems (including many of
the above LWE-based schemes) can be based on the classical hardness of GAPSVP, 
resolving a long-standing open question (see also Lyubashevsky and Micciancio 
[2009]). Roughly speaking, the way Peikert circumvents the difficulty we described 
earlier is by noticing that the existence of an oracle that is able to recover y from 
y + z, where y is a random lattice point and z is a random perturbation of length 
at most d, is by itself a useful piece of information as it provides a lower bound 
on the length of the shortest nonzero vector. By trying to construct such oracles 
for several different values of d and checking which ones work, Peikert is able to 
obtain a good approximation of the length of the shortest nonzero vector. 

Removing the quantum part, however, comes at a cost: the construction can no 
longer be iterative, the hardness can no longer be based on SIVP, and even for hard- 
ness based on GAPSVP, the modulus p in the LWE problem must be exponentially 
big unless we assume the hardness of a nonstandard variant of GAPSVP. Because 
of this, we believe that dequantizing our main theorem remains an important open 
problem. 


1.1. OVERVIEW. In this section, we give a brief informal overview of the proof 
of our main theorem, Theorem 1.1. The complete proof appears in Section 3. We 
do not discuss here the reductions in Section 4 and the cryptosystem in Section 5 
as these parts of the article are more similar to previous work. 

In addition to some very basic definitions related to lattices, we will make heavy
use here of the discrete Gaussian distribution on $L$ of width $r$, denoted $D_{L,r}$. This
is the distribution whose support is $L$ (which is typically a lattice), and in which
the probability of each $x \in L$ is proportional to $\exp(-\pi \|x/r\|^2)$ (see Eq. (6) and
Figure 2). We also mention here the smoothing parameter $\eta_\varepsilon(L)$. This is a real
positive number associated with any lattice $L$ ($\varepsilon$ is an accuracy parameter which
we can safely ignore here). Roughly speaking, it gives the smallest $r$ starting from
which $D_{L,r}$ “behaves like” a continuous Gaussian distribution. For instance, for
$r \ge \eta_\varepsilon(L)$, vectors chosen from $D_{L,r}$ have norm roughly $r\sqrt{n}$ with high probability.
In contrast, for sufficiently small $r$, $D_{L,r}$ gives almost all its mass to the origin 0.
Although not required for this section, a complete list of definitions can be found
in Section 2.

FIG. 2. $D_{L,2}$ (left) and $D_{L,1}$ (right) for a two-dimensional lattice $L$. The z-axis represents probability.


Let $\alpha$, $p$, $n$ be such that $\alpha p > 2\sqrt{n}$, as required in Theorem 1.1, and assume we
have an oracle that solves $\mathrm{LWE}_{p,\bar{\Psi}_\alpha}$. For concreteness, we can think of $p = n^2$ and
$\alpha = 1/n$. Our goal is to show how to solve the two lattice problems mentioned in
Theorem 1.1. As we prove in Subsection 3.3 using standard reductions, it suffices
to solve the following discrete Gaussian sampling problem (DGS): Given an $n$-dimensional
lattice $L$ and a number $r \ge \sqrt{2n} \cdot \eta_\varepsilon(L)/\alpha$, output a sample from $D_{L,r}$.
Intuitively, the connection to GAPSVP and SIVP comes from the fact that by taking
$r$ close to its lower limit $\sqrt{2n} \cdot \eta_\varepsilon(L)/\alpha$, we can obtain short lattice vectors (of
length roughly $\sqrt{n}\,r$). In the rest of this section, we describe our algorithm for
sampling from $D_{L,r}$. We note that the exact lower bound on $r$ is not that important
for purposes of this overview, as it only affects the approximation factor we obtain
for GAPSVP and SIVP. It suffices to keep in mind that our goal is to sample from
$D_{L,r}$ for $r$ that is rather small, say within a polynomial factor of $\eta_\varepsilon(L)$.

The core of the algorithm is the following procedure, which we call the “iterative
step”. Its input consists of a number $r$ (which is guaranteed to be not too small,
namely, greater than $\sqrt{2}\,p\,\eta_\varepsilon(L)$), and $n^c$ samples from $D_{L,r}$ where $c$ is some constant.
Its output is a sample from the distribution $D_{L,r'}$ for $r' = r\sqrt{n}/(\alpha p)$. Notice
that since $\alpha p > 2\sqrt{n}$, $r' < r/2$. In order to perform this “magic” of converting
vectors of norm $\sqrt{n}\,r$ into shorter vectors of norm $\sqrt{n}\,r'$, the procedure of course
needs to use the LWE oracle.

Given the iterative step, the algorithm for solving DGS works as follows. Let $r_i$
denote $r \cdot (\alpha p/\sqrt{n})^i$. The algorithm starts by producing $n^c$ samples from $D_{L,r_{3n}}$.
Because $r_{3n}$ is so large, such samples can be computed efficiently by a simple
procedure described in Lemma 3.2. Next comes the core of the algorithm: for
$i = 3n, 3n - 1, \ldots, 1$ the algorithm uses its $n^c$ samples from $D_{L,r_i}$ to produce $n^c$
samples from $D_{L,r_{i-1}}$ by calling the iterative step $n^c$ times. Eventually, we end
up with $n^c$ samples from $D_{L,r_0} = D_{L,r}$ and we complete the algorithm by simply
outputting the first of those. Note the following crucial fact: using $n^c$ samples from
$D_{L,r_i}$, we are able to generate the same number of samples $n^c$ from $D_{L,r_{i-1}}$ (in fact,
we could even generate more than $n^c$ samples). The algorithm would not work if
we could only generate, say, $n^c/2$ samples, as this would require us to start with an
exponential number of samples.
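
The outer loop just described can be sketched as follows. This is only a control-flow illustration: `iterative_step` is a hypothetical callable standing in for the procedure that uses the LWE oracle and the quantum part, and the names `radius_schedule` and `solve_dgs` and the numeric parameters are assumptions of the example.

```python
import math

def radius_schedule(r, alpha, p, n):
    """Return [r_0, r_1, ..., r_{3n}] with r_i = r * (alpha*p/sqrt(n))**i."""
    ratio = alpha * p / math.sqrt(n)
    return [r * ratio ** i for i in range(3 * n + 1)]

def solve_dgs(initial_samples, radii, iterative_step):
    """Run i = 3n, ..., 1. In each round the current samples (from D_{L, r_i})
    are reused to produce the same number of samples for r_{i-1}."""
    samples = list(initial_samples)
    for i in range(len(radii) - 1, 0, -1):
        # One call per output sample; every call sees the whole current batch.
        samples = [iterative_step(samples, radii[i]) for _ in range(len(samples))]
    return samples[0]

# Illustrative parameters satisfying alpha*p > 2*sqrt(n): here alpha*p = 32, 2*sqrt(n) = 8.
radii = radius_schedule(r=1.0, alpha=0.5, p=64, n=16)
```

Because each round returns as many samples as it consumes, the batch size stays fixed over all $3n$ rounds, which is the crucial fact noted above.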

We now finally get to describe the iterative step. Recall that as input we have $n^c$
samples from $D_{L,r}$ and we are supposed to generate a sample from $D_{L,r'}$ where
$r' = r\sqrt{n}/(\alpha p)$. Moreover, $r$ is known and guaranteed to be at least $\sqrt{2}\,p\,\eta_\varepsilon(L)$,
which can be shown to imply that $\alpha p/r < \lambda_1(L^*)/2$. As mentioned above, the exact
lower bound on $r$ does not matter much for this overview; it suffices to keep in
mind that $r$ is sufficiently larger than $\eta_\varepsilon(L)$, and that $1/r$ is sufficiently smaller than
$\lambda_1(L^*)$.

FIG. 3. Two iterations of the algorithm.

The iterative step is obtained by combining two parts (see Figure 3). In the
first part, we construct a classical algorithm that uses the given samples and the
LWE oracle to solve the following closest vector problem, which we denote by
$\mathrm{CVP}_{L^*,\alpha p/r}$: given any point $x \in \mathbb{R}^n$ within distance $\alpha p/r$ of the dual lattice $L^*$,
output the closest vector in $L^*$ to $x$.² By our assumption on $r$, the distance between
any two points in $L^*$ is greater than $2\alpha p/r$ and hence the closest vector is unique.
In the second part, we use this algorithm to generate samples from $D_{L,r\sqrt{n}/(\alpha p)}$. This part
is quantum (and in fact, the only quantum part of our proof). The idea here is to use
the $\mathrm{CVP}_{L^*,\alpha p/r}$ algorithm to generate a certain quantum superposition which, after
applying the quantum Fourier transform and performing a measurement, provides
us with a sample from $D_{L,r\sqrt{n}/(\alpha p)}$. In the following, we describe each of the two
parts in more detail.


Part 1. We start by recalling the main idea in Aharonov and Regev [2005].
Consider some probability distribution $D$ on some lattice $L$ and consider its Fourier
transform $f : \mathbb{R}^n \to \mathbb{C}$, defined as

$$f(x) = \sum_{y \in L} D(y) \exp(2\pi i \langle x, y \rangle) = \mathrm{Exp}_{y \sim D}\left[\exp(2\pi i \langle x, y \rangle)\right]$$

where in the second equality we simply rewrite the sum as an expectation. By
definition, $f$ is $L^*$-periodic, that is, $f(x) = f(x + y)$ for any $x \in \mathbb{R}^n$ and $y \in L^*$.
In Aharonov and Regev [2005] it was shown that given a polynomial number of
samples from $D$, one can compute an approximation of $f$ to within $\pm 1/\mathrm{poly}(n)$. To
see this, note that by the Chernoff–Hoeffding bound, if $y_1, \ldots, y_N$ are $N = \mathrm{poly}(n)$
independent samples from $D$, then

$$f(x) \approx \frac{1}{N} \sum_{j=1}^{N} \exp(2\pi i \langle x, y_j \rangle)$$

where the approximation is to within $\pm 1/\mathrm{poly}(n)$ and holds with probability exponentially
close to 1, assuming that $N$ is a large enough polynomial.

² In fact, we only solve $\mathrm{CVP}_{L^*,\alpha p/(\sqrt{2} r)}$ but for simplicity we ignore the factor $\sqrt{2}$ here.

FIG. 4. $f_{1/r}$ for a two-dimensional lattice.

By applying this idea to the samples from $D_{L,r}$ given to us as input, we obtain a
good approximation of the Fourier transform of $D_{L,r}$, which we denote by $f_{1/r}$. It
can be shown that since $1/r \ll \lambda_1(L^*)$ one has the approximation

$$f_{1/r}(x) \approx \exp\left(-\pi \left(r \cdot \mathrm{dist}(L^*, x)\right)^2\right) \qquad (1)$$

(see Figure 4). Hence, $f_{1/r}(x) \approx 1$ for any $x \in L^*$ (in fact an equality holds) and
as one gets away from $L^*$, its value decreases. For points within distance, say, $1/r$
from the lattice, its value is still some positive constant (roughly $\exp(-\pi)$). As
the distance from $L^*$ increases, the value of the function soon becomes negligible.
Since the distance between any two vectors in $L^*$ is at least $\lambda_1(L^*) \gg 1/r$, the
Gaussians around each point of $L^*$ are well separated.
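
The sample-based estimate and approximation (1) can be checked numerically in the simplest setting, the one-dimensional lattice $L = \mathbb{Z}$, whose dual is again $\mathbb{Z}$. The sketch below is a toy experiment under stated assumptions: the sampler truncates the support at $|y| \le 10r$, and the helper names, the width $r = 5$, and the sample count are chosen only for illustration.

```python
import cmath
import math
import random

def sample_discrete_gaussian_Z(r, count, rng):
    """Samples from D_{Z,r}: Pr[y] proportional to exp(-pi*(y/r)**2),
    with the support truncated at |y| <= 10r (negligible mass beyond)."""
    bound = int(math.ceil(10 * r))
    support = list(range(-bound, bound + 1))
    weights = [math.exp(-math.pi * (y / r) ** 2) for y in support]
    return rng.choices(support, weights=weights, k=count)

def estimate_fourier(samples, x):
    """Empirical average of exp(2*pi*i*x*y) over the samples."""
    return sum(cmath.exp(2j * math.pi * x * y) for y in samples) / len(samples)

def gaussian_hill(r, x):
    """Right-hand side of approximation (1) for L* = Z."""
    dist = abs(x - round(x))
    return math.exp(-math.pi * (r * dist) ** 2)

r = 5.0
samples = sample_discrete_gaussian_Z(r, 20000, random.Random(0))
estimate = estimate_fourier(samples, 0.1)   # a point at distance 0.1 from L*
reference = gaussian_hill(r, 0.1)           # exp(-pi * 0.25)
```

At points of $L^*$ the estimate equals 1 up to floating-point error, and halfway between lattice points it is close to 0, in line with the picture of well-separated hills.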

Although not needed in this article, let us briefly outline how one can solve
$\mathrm{CVP}_{L^*,1/r}$ using samples from $D_{L,r}$. Assume that we are given some point $x$ within
distance $1/r$ of $L^*$. Intuitively, this $x$ is located on one of the Gaussians of $f_{1/r}$.
By repeatedly computing an approximation of $f_{1/r}$ using the samples from $D_{L,r}$ as
described above, we “walk uphill” on $f_{1/r}$ in an attempt to find its “peak”. This peak
corresponds to the closest lattice point to $x$. Actually, the procedure as described
here does not quite work: due to the error in our approximation of $f_{1/r}$, we cannot
find the closest lattice point exactly. It is possible to overcome this difficulty; see Liu
et al. [2006] for the details. The same procedure actually works for slightly longer
distances, namely $O(\sqrt{\log n}/r)$, but beyond that distance the value of $f_{1/r}$ becomes
negligible and no useful information can be extracted from our approximation of it.

FIG. 5. The Fourier transform of $D_{L+La/p,\,r/p}$ with $n = 2$, $p = 2$, $a = (0, 0)$ (left), $a = (1, 1)$ (right).


Unfortunately, solving $\mathrm{CVP}_{L^*,1/r}$ is not useful for the iterative step as it would
lead to samples from $D_{L,r\sqrt{n}}$, which is a wider rather than a narrower distribution
than the one we started with. This is not surprising, since our solution to $\mathrm{CVP}_{L^*,1/r}$
did not use the LWE oracle. Using the LWE oracle, we will now show that we can
gain an extra $\alpha p$ factor in the radius, and obtain the desired $\mathrm{CVP}_{L^*,\alpha p/r}$ algorithm.

Notice that if we could somehow obtain samples from $D_{L,r/p}$ we would be done:
using the procedure described above, we could solve $\mathrm{CVP}_{L^*,p/r}$, which is better
than what we need. Unfortunately, it is not clear how to obtain such samples, even
with the help of the LWE oracle. Nevertheless, here is an obvious way to obtain
something similar to samples from $D_{L,r/p}$: just take the given samples from $D_{L,r}$
and divide them by $p$. This provides us with samples from $D_{L/p,\,r/p}$ where $L/p$ is
the lattice $L$ scaled down by a factor of $p$. In the following, we will show how to
use these samples to solve $\mathrm{CVP}_{L^*,\alpha p/r}$.

Let us first try to understand what the distribution $D_{L/p,r/p}$ looks like. Notice
that the lattice L/p consists of $p^n$ translates of the original lattice L. Namely, for
each $a \in \mathbb{Z}_p^n$ consider the set

$$L + La/p = \{Lb/p \mid b \in \mathbb{Z}^n,\ b \bmod p = a\}.$$


Then, $\{L + La/p \mid a \in \mathbb{Z}_p^n\}$ forms a partition of L/p. Moreover, it can be shown
that since r/p is larger than the smoothing parameter $\eta_\varepsilon(L)$, the probability given
to each L + La/p under $D_{L/p,r/p}$ is essentially the same, that is, $p^{-n}$. Intuitively,
beyond the smoothing parameter, the Gaussian measure no longer “sees” the discrete
structure of L, so in particular it is not affected by translations (this will be
shown in Claim 3.8).

This leads us to consider the following distribution, call it D. A sample from D is
a pair (a, y) where y is sampled from $D_{L/p,r/p}$, and $a \in \mathbb{Z}_p^n$ is such that $y \in L + La/p$.
Notice that we can easily obtain samples from D using the given samples from $D_{L,r}$.
From the above discussion we have that the marginal distribution of a is essentially
uniform. Moreover, by definition we have that the distribution of y conditioned on
any a is $D_{L+La/p,r/p}$. Hence, D is essentially identical to the distribution on pairs
(a, y) in which $a \in \mathbb{Z}_p^n$ is chosen uniformly at random, and then y is sampled from
$D_{L+La/p,r/p}$. From now on, we think of D as being this distribution.

We now examine the Fourier transform of $D_{L+La/p,r/p}$ (see Figure 5). When a is
zero, we already know that the Fourier transform is $f_{p/r}$. For general a, a standard
calculation shows that the Fourier transform of $D_{L+La/p,r/p}$ is given by

$$\exp(2\pi i \langle a, \tau(x)\rangle / p) \cdot f_{p/r}(x) \tag{2}$$

where $\tau(x) \in \mathbb{Z}_p^n$ is defined as

$$\tau(x) := (L^*)^{-1}\kappa_{L^*}(x) \bmod p,$$


and $\kappa_{L^*}(x)$ denotes the (unique) closest vector in $L^*$ to x. In other words, $\tau(x)$ is
the vector of coefficients of the vector in $L^*$ closest to x when represented in the
basis of $L^*$, reduced modulo p. So we see that the Fourier transform of $D_{L+La/p,r/p}$ is
essentially $f_{p/r}$, except that each “hill” gets its own phase depending on the vector
of coefficients of the lattice point in its center. These phases appear because of
a well-known property of the Fourier transform: translation
is transformed to multiplication by a phase.

Equipped with this understanding of the Fourier transform of $D_{L+La/p,r/p}$, we can
get back to our task of solving $\mathrm{CVP}_{L^*,\alpha p/r}$. By the definition of the Fourier transform,
we know that the average of $\exp(2\pi i\langle x, y\rangle)$ over $y \sim D_{L+La/p,r/p}$ is given by (2).
Assume for simplicity that $x \in L^*$ (even though in this case finding the closest vector
is trivial; it is simply x itself). In this case, (2) is equal to $\exp(2\pi i\langle a, \tau(x)\rangle/p)$.
Since the absolute value of this expression is 1, we see that for such x, the random
variable $\langle x, y\rangle \bmod 1$ (where $y \sim D_{L+La/p,r/p}$) must be deterministically equal to
$\langle a, \tau(x)\rangle/p \bmod 1$ (this fact can also be seen directly). In other words, when $x \in L^*$,
each sample (a, y) from D provides us with a linear equation

$$\langle a, \tau(x)\rangle = p\langle x, y\rangle \bmod p$$


with a distributed essentially uniformly in $\mathbb{Z}_p^n$. After collecting about n such equations,
we can use Gaussian elimination to recover $\tau(x) \in \mathbb{Z}_p^n$. And as we shall show
in Lemma 3.5 using a simple reduction, the ability to compute $\tau(x)$ easily leads to
the ability to compute the closest vector to x.

We now turn to the more interesting case in which x is not in $L^*$, but only
within distance $\alpha p/r$ of $L^*$. In this case, the phase of (2) is still equal to
$\exp(2\pi i\langle a, \tau(x)\rangle/p)$. Its absolute value, however, is no longer 1, but still quite
close to 1 (depending on the distance of x from $L^*$). Therefore, the random
variable $\langle x, y\rangle \bmod 1$, where $y \sim D_{L+La/p,r/p}$, must typically be quite close to
$\langle a, \tau(x)\rangle/p \bmod 1$ (since, as before, the average of $\exp(2\pi i\langle x, y\rangle)$ is given by (2)).
Hence, each sample (a, y) from D provides us with a linear equation with error,

$$\langle a, \tau(x)\rangle \approx \lfloor p\langle x, y\rangle \rceil \bmod p.$$


Notice that $p\langle x, y\rangle$ is typically not an integer and hence we round it to the nearest
integer. After collecting a polynomial number of such equations, we call the LWE
oracle in order to recover $\tau(x)$. Notice that a is distributed essentially uniformly,
as required by the LWE oracle. Finally, as mentioned above, once we are able to
compute $\tau(x)$, computing the closest vector to x is easy (this will be shown in Lemma 3.5).

The above outline ignores one important detail: what is the error distribution in
the equations we produce? Recall that the LWE oracle is only guaranteed to work
with error distribution $\Psi_\alpha$. Luckily, as we will show in Claim 3.9 and Corollary 3.10
(using a rather technical proof), if x is at distance $\beta p/r$ from $L^*$ for some $0 \le \beta \le \alpha$,
then the error distribution in the equations is essentially $\Psi_\beta$. (In fact, in order to get
this error distribution, we will have to modify the procedure a bit and add a small
amount of normal error to each equation.) We then complete the proof by noting
(in Lemma 3.7) that an oracle for solving $\mathrm{LWE}_{p,\Psi_\alpha}$ can be used to solve $\mathrm{LWE}_{p,\Psi_\beta}$
for any $0 \le \beta \le \alpha$ (even if $\beta$ is unknown).


Part 2. In this part, we describe a quantum algorithm that, using a $\mathrm{CVP}_{L^*,\alpha p/r}$
oracle, generates one sample from $D_{L,r\sqrt{n}/(\alpha p)}$. Equivalently, we show how to produce
a sample from $D_{L,r}$ given a $\mathrm{CVP}_{L^*,\sqrt{n}/r}$ oracle. The procedure is essentially
the following: first, by using the CVP oracle, create a quantum state corresponding
to $f_{1/r}$. Then, apply the quantum Fourier transform and obtain a quantum state
corresponding to $D_{L,r}$. By measuring this state, we obtain a sample from $D_{L,r}$.

In the following, we describe this procedure in more detail. Our first goal is to
create a quantum state corresponding to $f_{1/r}$. Informally, this can be written as

$$\sum_{x \in \mathbb{R}^n} f_{1/r}(x)\,|x\rangle. \tag{3}$$


This state is clearly not well defined. In the actual procedure, $\mathbb{R}^n$ is replaced with
some finite set (namely, all points inside the basic parallelepiped of L* that belong 
to some fine grid). This introduces several technical complications and makes the 
computations rather tedious. Therefore, in the present discussion, we opt to continue 
with informal expressions as in (3). 

Let us now continue our description of the procedure. In order to prepare the
state in (3), we first create the uniform superposition on $L^*$,

$$\sum_{x \in L^*} |x\rangle.$$


(This step is actually unnecessary in the real procedure, since there we work in the
basic parallelepiped of $L^*$; but for the present discussion, it is helpful to imagine
that we start with this state.) On a separate register, we create a “Gaussian state” of
width 1/r,

$$\sum_{z \in \mathbb{R}^n} \exp(-\pi\|rz\|^2)\,|z\rangle.$$

This can be done using known techniques. The combined state of the system can
be written as

$$\sum_{x \in L^*,\, z \in \mathbb{R}^n} \exp(-\pi\|rz\|^2)\,|x, z\rangle.$$

We now add the first register to the second (a reversible operation), and obtain

$$\sum_{x \in L^*,\, z \in \mathbb{R}^n} \exp(-\pi\|rz\|^2)\,|x, x + z\rangle.$$

Finally, we would like to erase, or uncompute, the first register to obtain

$$\sum_{x \in L^*,\, z \in \mathbb{R}^n} \exp(-\pi\|rz\|^2)\,|x + z\rangle \approx \sum_{z \in \mathbb{R}^n} f_{1/r}(z)\,|z\rangle.$$


However, “erasing” a register is in general not a reversible operation. In order for
it to be reversible, we need to be able to compute x from the remaining register
x + z. This is precisely why we need the $\mathrm{CVP}_{L^*,\sqrt{n}/r}$ oracle. It can be shown that
almost all the mass of $\exp(-\pi\|rz\|^2)$ is on z such that $\|z\| \le \sqrt{n}/r$. Hence, x + z
is within distance $\sqrt{n}/r$ of the lattice and the oracle finds the closest lattice point,
namely, x. This allows us to erase the first register in a reversible way.

In the final part of the procedure, we apply the quantum Fourier transform. This
yields the quantum state corresponding to $D_{L,r}$, namely,

$$\sum_{y \in L} D_{L,r}(y)\,|y\rangle.$$

By measuring this state, we obtain a sample from the distribution $D_{L,r}$ (or in fact
from $D_{L,r}^2 = D_{L,r/\sqrt{2}}$, but this is a minor issue).


2. Preliminaries 


In this section, we include some notation that will be used throughout the article.
Most of the notation is standard. Some of the less standard notation is: the Gaussian
function $\rho$ (Eq. (4)), the Gaussian distribution $\nu$ (Eq. (5)), the periodic normal
distribution $\Psi$ (Eq. (7)), the discretization of a distribution on $\mathbb{T}$ (Eq. (8)), the
discrete Gaussian distribution D (Eq. (6)), the unique closest lattice vector $\kappa$ (above
Lemma 2.3), and the smoothing parameter $\eta$ (Definition 2.10).


General. For two real numbers x and y > 0, we define x mod y as $x - \lfloor x/y \rfloor y$.
For $x \in \mathbb{R}$, we define $\lfloor x \rceil$ as the integer closest to x or, in case two such integers
exist, the smaller of the two. For any integer $p \ge 2$, we write $\mathbb{Z}_p$ for the cyclic group
$\{0, 1, \ldots, p-1\}$ with addition modulo p. We also write $\mathbb{T}$ for $\mathbb{R}/\mathbb{Z}$, that is, the
segment [0, 1) with addition modulo 1.
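The conventions above can be made concrete with a short Python sketch. The function names `mod`, `round_nearest`, and `to_torus` are ours, chosen only for illustration, and floating-point arithmetic stands in for exact real arithmetic.

```python
import math

def mod(x: float, y: float) -> float:
    """x mod y for y > 0, defined as x - floor(x / y) * y."""
    return x - math.floor(x / y) * y

def round_nearest(x: float) -> int:
    """Closest integer to x; on a tie, the smaller of the two candidates."""
    return math.ceil(x - 0.5)

def to_torus(x: float) -> float:
    """Representative in [0, 1) of x viewed as an element of T = R/Z."""
    return mod(x, 1.0)
```

Note that Python's built-in `round` breaks ties toward the even integer, which is why a separate helper is used for the tie-breaking rule stated above.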

We define a negligible amount in n as an amount that is asymptotically smaller
than $n^{-c}$ for any constant c > 0. More precisely, f(n) is a negligible function in
n if $\lim_{n\to\infty} n^c f(n) = 0$ for any c > 0. Similarly, a non-negligible amount is one
which is at least $n^{-c}$ for some c > 0. Also, when we say that an expression is
exponentially small in n we mean that it is at most $2^{-\Omega(n)}$. Finally, when we say
that an expression (most often, some probability) is exponentially close to 1, we
mean that it is $1 - 2^{-\Omega(n)}$.

We say that an algorithm A with oracle access is a distinguisher between two 
distributions if its acceptance probability when the oracle outputs samples of the 
first distribution and its acceptance probability when the oracle outputs samples of 
the second distribution differ by a nonnegligible amount. 

Essentially all algorithms and reductions in this article have an exponentially 
small error probability, and we sometimes do not state this explicitly. 

For clarity, we present some of our reductions in a model that allows operations 
on real numbers. It is possible to modify them in a straightforward way so that 
they operate in a model that approximates real numbers up to an error of $2^{-n^c}$ for
an arbitrarily large constant c in time polynomial in n.

Given two probability density functions $\phi_1, \phi_2$ on $\mathbb{R}^n$, we define the statistical
distance between them as

$$\Delta(\phi_1, \phi_2) := \int_{\mathbb{R}^n} |\phi_1(x) - \phi_2(x)|\,dx$$

(notice that with this definition, the statistical distance ranges in [0, 2]). A similar
definition can be given for discrete random variables. The statistical distance
satisfies the triangle inequality, that is, for any $\phi_1, \phi_2, \phi_3$,

$$\Delta(\phi_1, \phi_3) \le \Delta(\phi_1, \phi_2) + \Delta(\phi_2, \phi_3).$$


Another important fact that we often use is that the statistical distance cannot
increase by applying a (possibly randomized) function f, that is,

$$\Delta(f(X), f(Y)) \le \Delta(X, Y),$$

see, for example, Micciancio and Goldwasser [2002]. In particular, this implies
that the acceptance probability of any algorithm on inputs from X differs from
its acceptance probability on inputs from Y by at most $\frac{1}{2}\Delta(X, Y)$ (the factor of one half
coming from the choice of normalization in our definition of $\Delta$).
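For discrete random variables, the definition and the monotonicity property can be illustrated as follows. This is only a sketch: distributions are represented as Python dictionaries from outcomes to probabilities, and the helper names `statistical_distance` and `push_forward` are hypothetical.

```python
def statistical_distance(p: dict, q: dict) -> float:
    """Sum of |p(x) - q(x)| over the joint support (no factor 1/2, so range is [0, 2])."""
    support = set(p) | set(q)
    return sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

def push_forward(p: dict, f) -> dict:
    """Distribution of f(X) when X is distributed according to p."""
    out = {}
    for x, mass in p.items():
        out[f(x)] = out.get(f(x), 0.0) + mass
    return out
```

For instance, two distributions with disjoint supports are at distance 2, and applying a deterministic function such as reduction modulo 2 to both can only bring them closer.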


Gaussians and other distributions. Recall that the normal distribution with
mean 0 and variance $\sigma^2$ is the distribution on $\mathbb{R}$ given by the density function
$\frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{1}{2}\left(\frac{x}{\sigma}\right)^2\right)$, where exp(y) denotes $e^y$. Also recall that the sum of two
independent normal variables with mean 0 and variances $\sigma_1^2$ and $\sigma_2^2$ is a normal variable
with mean 0 and variance $\sigma_1^2 + \sigma_2^2$. For a vector x and any s > 0, let

$$\rho_s(x) := \exp(-\pi\|x/s\|^2) \tag{4}$$

be a Gaussian function scaled by a factor of s. We denote $\rho_1$ by $\rho$. Note that
$\int_{x \in \mathbb{R}^n} \rho_s(x)\,dx = s^n$. Hence,

$$\nu_s := \rho_s / s^n \tag{5}$$


is an n-dimensional probability density function and, as before, we use $\nu$ to denote $\nu_1$.
The dimension n is implicit. Notice that a sample from the Gaussian distribution $\nu_s$
can be obtained by taking n independent samples from the 1-dimensional Gaussian
distribution. Hence, sampling from $\nu_s$ to within arbitrarily good accuracy can be
performed efficiently by using standard techniques. For simplicity, in this article
we assume that we can sample from $\nu_s$ exactly.³ Functions are extended to sets in
the usual way; that is, $\rho_s(A) = \sum_{x \in A} \rho_s(x)$ for any countable set A. For any vector
$c \in \mathbb{R}^n$, we define $\rho_{s,c}(x) := \rho_s(x - c)$ to be a shifted version of $\rho_s$. The following
simple claim bounds the amount by which $\rho_s(x)$ can shrink by a small change
in x.


CLAIM 2.1. For all s, t, l > 0 and $x, y \in \mathbb{R}^n$ with $\|x\| \le t$ and $\|x - y\| \le l$,

$$\rho_s(y) \ge (1 - \pi(2lt + l^2)/s^2)\,\rho_s(x).$$

PROOF. Using the inequality $e^{-z} \ge 1 - z$,

$$\rho_s(y) = e^{-\pi\|y/s\|^2} \ge e^{-\pi(\|x\|/s + l/s)^2} \ge e^{-\pi(2lt + l^2)/s^2}\,\rho_s(x) \ge (1 - \pi(2lt + l^2)/s^2)\,\rho_s(x).$$


³ In practice, when only finite precision is available, $\nu_s$ can be approximated by picking a fine grid,
and picking points from the grid with probability approximately proportional to $\nu_s$. All our arguments
can be made rigorous by selecting a sufficiently fine grid.

For any countable set A and a parameter s > 0, we define the discrete Gaussian
probability distribution $D_{A,s}$ as

$$\forall x \in A,\quad D_{A,s}(x) := \frac{\rho_s(x)}{\rho_s(A)}. \tag{6}$$


See Figure 2 for an illustration. 
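As a small numerical illustration of Eq. (6), the following sketch computes $D_{A,s}$ for a finite set A of points given as tuples. Restricting to a finite set is an assumption made here so that the normalization $\rho_s(A)$ can be computed directly; the names `rho` and `discrete_gaussian` are ours.

```python
import math

def rho(x, s: float) -> float:
    """Gaussian function rho_s(x) = exp(-pi * ||x / s||^2) from Eq. (4)."""
    return math.exp(-math.pi * sum(c * c for c in x) / (s * s))

def discrete_gaussian(points, s: float) -> dict:
    """Map each point x of the finite set A to D_{A,s}(x) = rho_s(x) / rho_s(A)."""
    weights = {tuple(x): rho(x, s) for x in points}
    total = sum(weights.values())
    return {x: w / total for x, w in weights.items()}
```

For example, taking A to be the integers in a window around the origin gives a truncated approximation of $D_{\mathbb{Z},s}$, with the ratio of the weights of 1 and 0 equal to $\exp(-\pi/s^2)$.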
For $\beta \in \mathbb{R}^+$ the distribution $\Psi_\beta$ is the distribution on $\mathbb{T}$ obtained by sampling
from a normal variable with mean 0 and standard deviation $\beta/\sqrt{2\pi}$ and reducing the
result modulo 1 (i.e., a periodization of the normal distribution),

$$\forall r \in [0, 1),\quad \Psi_\beta(r) := \sum_{k=-\infty}^{\infty} \frac{1}{\beta}\exp\left(-\pi\left(\frac{r - k}{\beta}\right)^2\right). \tag{7}$$

Clearly, one can efficiently sample from $\Psi_\beta$. The following technical claim shows
that a small change in the parameter $\beta$ does not change the distribution $\Psi_\beta$ by
much.


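CLAIM 2.2. For any $0 < \alpha < \beta \le 2\alpha$,

$$\Delta(\Psi_\alpha, \Psi_\beta) \le 9\left(\frac{\beta}{\alpha} - 1\right).$$

PROOF. We will show that the statistical distance between a normal variable
with standard deviation $\alpha/\sqrt{2\pi}$ and one with standard deviation $\beta/\sqrt{2\pi}$ is at most
$9(\beta/\alpha - 1)$. This implies the claim since applying a function (modulo 1 in this case)
cannot increase the statistical distance. By scaling, we can assume without loss of
generality that $\alpha = 1$ and $\beta = 1 + \varepsilon$ for some $0 < \varepsilon \le 1$. Then, the statistical distance
that we wish to bound is given by

$$\begin{aligned}
\int_{\mathbb{R}} \left| e^{-\pi x^2} - \frac{1}{1+\varepsilon} e^{-\pi x^2/(1+\varepsilon)^2} \right| dx
&\le \int_{\mathbb{R}} \left| e^{-\pi x^2} - e^{-\pi x^2/(1+\varepsilon)^2} \right| dx + \int_{\mathbb{R}} \left| \left(1 - \frac{1}{1+\varepsilon}\right) e^{-\pi x^2/(1+\varepsilon)^2} \right| dx \\
&= \int_{\mathbb{R}} \left| e^{-\pi x^2} - e^{-\pi x^2/(1+\varepsilon)^2} \right| dx + \varepsilon \\
&= \int_{\mathbb{R}} \left| e^{-\pi(1 - 1/(1+\varepsilon)^2)x^2} - 1 \right| e^{-\pi x^2/(1+\varepsilon)^2}\, dx + \varepsilon.
\end{aligned}$$

Now, since $1 - z \le e^{-z} \le 1$ for all $z \ge 0$,

$$\left| e^{-\pi(1 - 1/(1+\varepsilon)^2)x^2} - 1 \right| \le \pi(1 - 1/(1+\varepsilon)^2)x^2 \le 2\pi\varepsilon x^2.$$

Hence we can bound the statistical distance above by

$$\varepsilon + 2\pi\varepsilon \int_{\mathbb{R}} x^2 e^{-\pi x^2/(1+\varepsilon)^2}\, dx = \varepsilon + \varepsilon(1+\varepsilon)^3 \le 9\varepsilon.$$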
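The bound of Claim 2.2 can be checked numerically. The sketch below evaluates a truncated version of the density in Eq. (7) and estimates $\Delta(\Psi_\alpha, \Psi_\beta)$ with a midpoint rule on [0, 1). The truncation to finitely many terms, the grid size, and the function names are assumptions made for illustration; this is a sanity check, not a proof.

```python
import math

def psi_density(r: float, beta: float, terms: int = 50) -> float:
    """Truncated Eq. (7): sum over k in [-terms, terms] of Gaussian copies."""
    return sum(
        math.exp(-math.pi * ((r - k) / beta) ** 2) / beta
        for k in range(-terms, terms + 1)
    )

def psi_distance(alpha: float, beta: float, grid: int = 2000) -> float:
    """Midpoint-rule estimate of the integral of |Psi_alpha - Psi_beta| over [0, 1)."""
    h = 1.0 / grid
    return h * sum(
        abs(psi_density((j + 0.5) * h, alpha) - psi_density((j + 0.5) * h, beta))
        for j in range(grid)
    )
```

Comparing `psi_distance(alpha, beta)` with `9 * (beta / alpha - 1)` for a few pairs with $\alpha < \beta \le 2\alpha$ gives a quick consistency check of the claim.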


For an arbitrary probability distribution with density function $\phi : \mathbb{T} \to \mathbb{R}^+$ and
some integer $p \ge 1$ we define its discretization $\bar\phi : \mathbb{Z}_p \to \mathbb{R}^+$ as the discrete
probability distribution obtained by sampling from $\phi$, multiplying by p, and rounding
to the closest integer modulo p. More formally,

$$\bar\phi(i) := \int_{(i-1/2)/p}^{(i+1/2)/p} \phi(x)\,dx. \tag{8}$$

As an example, $\bar\Psi_\beta$ is shown in Figure 1.
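For $\phi = \Psi_\beta$, the integral in Eq. (8) has a closed form in terms of the error function, since each Gaussian copy in the periodization integrates to a difference of two `erf` values. The following sketch computes the probability vector of $\bar\Psi_\beta$; the truncation parameter `terms` and the function name are assumptions made for illustration.

```python
import math

def discretized_psi(beta: float, p: int, terms: int = 50) -> list:
    """Return [bar-Psi_beta(0), ..., bar-Psi_beta(p-1)] following Eq. (8)."""
    c = math.sqrt(math.pi) / beta

    def mass(lo: float, hi: float) -> float:
        # Integral over [lo, hi] of the periodized density, one Gaussian copy per k.
        return sum(
            0.5 * (math.erf(c * (hi - k)) - math.erf(c * (lo - k)))
            for k in range(-terms, terms + 1)
        )

    return [mass((i - 0.5) / p, (i + 0.5) / p) for i in range(p)]
```

The resulting vector sums to 1 up to truncation error, is symmetric under $i \mapsto p - i$, and is largest at i = 0.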

Let $p \ge 2$ be some integer, and let $\chi : \mathbb{Z}_p \to \mathbb{R}^+$ be some probability distribution
on $\mathbb{Z}_p$. Let n be an integer and let $s \in \mathbb{Z}_p^n$ be a vector. We define $A_{s,\chi}$ as the
distribution on $\mathbb{Z}_p^n \times \mathbb{Z}_p$ obtained by choosing a vector $a \in \mathbb{Z}_p^n$ uniformly at random,
choosing $e \in \mathbb{Z}_p$ according to $\chi$, and outputting $(a, \langle a, s\rangle + e)$, where additions are
performed in $\mathbb{Z}_p$, that is, modulo p. We also define U as the uniform distribution on
$\mathbb{Z}_p^n \times \mathbb{Z}_p$.

For a probability density function $\phi$ on $\mathbb{T}$, we define $A_{s,\phi}$ as the distribution on
$\mathbb{Z}_p^n \times \mathbb{T}$ obtained by choosing a vector $a \in \mathbb{Z}_p^n$ uniformly at random, choosing $e \in \mathbb{T}$
according to $\phi$, and outputting $(a, \langle a, s\rangle/p + e)$, where the addition is performed
in $\mathbb{T}$, that is, modulo 1.
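To make the definition of $A_{s,\chi}$ concrete, here is a minimal sampling sketch for the particular choice $\chi = \bar\Psi_\alpha$. The function names and the use of Python's `random.Random` as the randomness source are our assumptions; the error is produced by sampling a normal variable with standard deviation $\alpha/\sqrt{2\pi}$, multiplying by p, and rounding modulo p.

```python
import math
import random

def rounded_gaussian_error(p: int, alpha: float, rng: random.Random) -> int:
    """Sample from Psi_alpha, scale by p, and round to the nearest integer mod p."""
    e = rng.gauss(0.0, alpha / math.sqrt(2 * math.pi))
    return math.ceil(p * e - 0.5) % p

def sample_lwe(s, p: int, alpha: float, rng: random.Random):
    """One sample (a, <a, s> + e mod p) from A_{s, chi} with chi = bar-Psi_alpha."""
    a = [rng.randrange(p) for _ in s]
    e = rounded_gaussian_error(p, alpha, rng)
    b = (sum(ai * si for ai, si in zip(a, s)) + e) % p
    return a, b
```

Reducing the normal variable modulo 1 before scaling is unnecessary here, because an integer shift of e changes p·e by a multiple of p.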


Learning with Errors. For an integer p = p(n) and a distribution $\chi$ on $\mathbb{Z}_p$, we
say that an algorithm solves $\mathrm{LWE}_{p,\chi}$ if, for any $s \in \mathbb{Z}_p^n$, given samples from $A_{s,\chi}$
it outputs s with probability exponentially close to 1. Similarly, for a probability
density function $\phi$ on $\mathbb{T}$, we say that an algorithm solves $\mathrm{LWE}_{p,\phi}$ if, for any $s \in \mathbb{Z}_p^n$,
given samples from $A_{s,\phi}$ it outputs s with probability exponentially close to 1. In
both cases, we say that the algorithm is efficient if it runs in polynomial time in n.
Finally, we note that p is assumed to be prime only in Lemma 4.2; in the rest of
the article, including the main theorem, p can be an arbitrary integer.


Lattices. We briefly review some basic definitions; for a good introduction to
lattices, see Micciancio and Goldwasser [2002]. A lattice in $\mathbb{R}^n$ is defined as the set
of all integer combinations of n linearly independent vectors. This set of vectors
is known as a basis of the lattice and is not unique. Given a basis $(v_1, \ldots, v_n)$
of a lattice L, the fundamental parallelepiped generated by this basis is defined
as

$$\mathcal{P}(v_1, \ldots, v_n) = \left\{ \sum_{i=1}^n x_i v_i \;\middle|\; x_i \in [0, 1) \right\}.$$

When the choice of basis is clear, we write $\mathcal{P}(L)$ instead of $\mathcal{P}(v_1, \ldots, v_n)$. For
a point $x \in \mathbb{R}^n$ we define $x \bmod \mathcal{P}(L)$ as the unique point $y \in \mathcal{P}(L)$ such that
$y - x \in L$. We denote by det(L) the volume of the fundamental parallelepiped
of L or, equivalently, the absolute value of the determinant of the matrix whose
columns are the basis vectors of the lattice (det(L) is a lattice invariant, that is,
it is independent of the choice of basis). The dual of a lattice L in $\mathbb{R}^n$, denoted
$L^*$, is the lattice given by the set of all vectors $y \in \mathbb{R}^n$ such that $\langle x, y\rangle \in \mathbb{Z}$ for all
vectors $x \in L$. Similarly, given a basis $(v_1, \ldots, v_n)$ of a lattice, we define the dual
basis as the set of vectors $(v_1^*, \ldots, v_n^*)$ such that $\langle v_i, v_j^*\rangle = \delta_{ij}$ for all $i, j \in [n]$, where
$\delta_{ij}$ denotes the Kronecker delta, that is, 1 if i = j and 0 otherwise. With a slight
abuse of notation, we sometimes write L for the n × n matrix whose columns are
$v_1, \ldots, v_n$. With this notation, we notice that $L^* = (L^T)^{-1}$. From this, it follows
that $\det(L^*) = 1/\det(L)$. As another example of this notation, for a point $v \in L$
we write $L^{-1}v$ to indicate the integer coefficient vector of v.
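The relation $L^* = (L^T)^{-1}$ is easy to verify in dimension 2. The sketch below uses exact rational arithmetic; the restriction to 2 × 2 matrices and the helper names `dual_basis_2d` and `columns` are assumptions made to keep the example short.

```python
from fractions import Fraction

def dual_basis_2d(L):
    """L is [[a, b], [c, d]] with the basis vectors as columns; returns (L^T)^{-1}."""
    (a, b), (c, d) = L
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("columns must be linearly independent")
    return [[d / det, -c / det], [-b / det, a / det]]

def columns(M):
    """Columns of a 2 x 2 matrix given as a list of rows."""
    return [[M[0][j], M[1][j]] for j in range(2)]
```

With L having columns (2, 0) and (1, 3), the dual basis vectors are (1/2, −1/6) and (0, 1/3), and one can check directly that $\langle v_i, v_j^*\rangle = \delta_{ij}$ and that the determinant of the dual is 1/6.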
Let $\lambda_1(L)$ denote the length of the shortest nonzero vector in the lattice L. We
denote by $\lambda_n(L)$ the minimum length of a set of n linearly independent vectors from
L, where the length of a set is defined as the length of the longest vector in it. For a
lattice L and a point v whose distance from L is less than $\lambda_1(L)/2$ we define $\kappa_L(v)$
as the (unique) closest point to v in L. The following useful fact, due to Banaszczyk,
is known as a “transference theorem”. We remark that the lower bound is easy to
prove.

LEMMA 2.3 (BANASZCZYK 1993, THEOREM 2.1). For any n-dimensional lattice L,
$1 \le \lambda_1(L) \cdot \lambda_n(L^*) \le n$.


Two other useful facts by Banaszczyk are the following. The first bounds the
amount by which the Gaussian measure of a lattice changes by scaling; the second
shows that for any lattice L, the mass given by the discrete Gaussian measure $D_{L,r}$
to points of norm greater than $\sqrt{n}\,r$ is at most exponentially small (the analogous
statement for the continuous Gaussian $\nu_r$ is easy to establish).

LEMMA 2.4 (BANASZCZYK 1993, LEMMA 1.4(1)). For any lattice L and $a \ge 1$,
$\rho_a(L) \le a^n \rho(L)$.

LEMMA 2.5 (BANASZCZYK 1993, LEMMA 1.5(1)). Let $B_n$ denote the Euclidean
unit ball. Then, for any lattice L and any r > 0, $\rho_r(L \setminus \sqrt{n}\,r B_n) < 2^{-2n} \cdot \rho_r(L)$, where
$L \setminus \sqrt{n}\,r B_n$ is the set of lattice points of norm greater than $\sqrt{n}\,r$.


In this article, we consider the following lattice problems. The first two, the decision
version of the shortest vector problem (GAPSVP) and the shortest independent
vectors problem (SIVP), are among the most well-known lattice problems and are
concerned with $\lambda_1$ and $\lambda_n$, respectively. In the definitions below, $\gamma = \gamma(n) \ge 1$ is
the approximation factor, and the input lattice is given in the form of some arbitrary
basis.

Definition 2.6. An instance of $\mathrm{GAPSVP}_\gamma$ is given by an n-dimensional lattice
L and a number d > 0. In YES instances, $\lambda_1(L) \le d$ whereas in NO instances
$\lambda_1(L) > \gamma(n) \cdot d$.

Definition 2.7. An instance of $\mathrm{SIVP}_\gamma$ is given by an n-dimensional lattice L.
The goal is to output a set of n linearly independent lattice vectors of length at most
$\gamma(n) \cdot \lambda_n(L)$.


A useful generalization of SIVP is the following somewhat less standard lattice
problem, known as the generalized independent vectors problem (GIVP). Here, $\varphi$
denotes an arbitrary real-valued function on lattices. Choosing $\varphi = \lambda_n$ results in
SIVP.

Definition 2.8. An instance of $\mathrm{GIVP}_\gamma^\varphi$ is given by an n-dimensional lattice L.
The goal is to output a set of n linearly independent lattice vectors of length at most
$\gamma(n) \cdot \varphi(L)$.


Another useful (and even less standard) lattice problem is the following. We call
it the discrete Gaussian sampling problem (DGS). As before, $\varphi$ denotes an arbitrary
real-valued function on lattices.

Definition 2.9. An instance of $\mathrm{DGS}_\varphi$ is given by an n-dimensional lattice L
and a number $r > \varphi(L)$. The goal is to output a sample from $D_{L,r}$.

We also consider a variant of the closest vector problem (which is essentially
what is known as the bounded distance decoding problem [Liu et al. 2006]): For an
n-dimensional lattice L, and some d > 0, we say that an algorithm solves $\mathrm{CVP}_{L,d}$
if, given a point $x \in \mathbb{R}^n$ whose distance to L is at most d, the algorithm finds the
closest lattice point to x. In this article d will always be smaller than $\lambda_1(L)/2$ and
hence the closest vector is unique.


The Smoothing Parameter. We make heavy use of a lattice parameter known as 
the smoothing parameter [Micciancio and Regev 2007]. Intuitively, this parameter 
provides the width beyond which the discrete Gaussian measure on a lattice behaves 
like a continuous one. The precise definition is the following. 


Definition 2.10. For an n-dimensional lattice L and positive real $\varepsilon > 0$, we
define the smoothing parameter $\eta_\varepsilon(L)$ to be the smallest s such that $\rho_{1/s}(L^* \setminus \{0\}) \le \varepsilon$.

In other words, $\eta_\varepsilon(L)$ is the smallest s such that a Gaussian measure scaled by
1/s on the dual lattice $L^*$ gives all but $\varepsilon/(1 + \varepsilon)$ of its weight to the origin. We
usually take $\varepsilon$ to be some negligible function of the lattice dimension n. Notice that
$\rho_{1/s}(L^* \setminus \{0\})$ is a continuous and strictly decreasing function of s. Moreover, it
can be shown that $\lim_{s\to 0} \rho_{1/s}(L^* \setminus \{0\}) = \infty$ and $\lim_{s\to\infty} \rho_{1/s}(L^* \setminus \{0\}) = 0$. So,
the parameter $\eta_\varepsilon(L)$ is well defined for any $\varepsilon > 0$, and $\varepsilon \mapsto \eta_\varepsilon(L)$ is the inverse
function of $s \mapsto \rho_{1/s}(L^* \setminus \{0\})$. In particular, $\eta_\varepsilon(L)$ is also a continuous and strictly
decreasing function of $\varepsilon$.
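Because $s \mapsto \rho_{1/s}(L^* \setminus \{0\})$ is continuous and strictly decreasing, the smoothing parameter can be located by bisection. The sketch below does this for the one-dimensional lattice $L = \mathbb{Z}$, which is its own dual; the choice of lattice, the truncation of the sum, the bracketing interval (which presumes a small $\varepsilon$), and the function names are all assumptions made for illustration.

```python
import math

def rho_dual_nonzero(s: float, terms: int = 200) -> float:
    """rho_{1/s}(Z minus {0}) = 2 * sum_{k >= 1} exp(-pi (s k)^2), truncated."""
    return 2.0 * sum(math.exp(-math.pi * (s * k) ** 2) for k in range(1, terms + 1))

def smoothing_parameter_Z(eps: float, iters: int = 100) -> float:
    """Bisection for the smallest s with rho_{1/s}(Z minus {0}) <= eps."""
    lo, hi = 1e-3, 1.0
    while rho_dual_nonzero(hi) > eps:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if rho_dual_nonzero(mid) > eps:
            lo = mid
        else:
            hi = mid
    return hi
```

For $\varepsilon = 0.01$ this gives a value of roughly 1.3, which can be compared against the upper bound of Lemma 2.12 and the lower bound of Claim 2.13 with n = 1 and $\lambda_1(\mathbb{Z}) = 1$.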

The motivation for this definition (and the name “smoothing parameter”) comes
from the following result, shown in Micciancio and Regev [2007] (and included here
as Claim 3.8). Informally, it says that if we choose a ‘random’ lattice point from an
n-dimensional lattice L and add continuous Gaussian noise $\nu_s$ for some $s > \eta_\varepsilon(L)$ then
the resulting distribution is within statistical distance $\varepsilon$ of the ‘uniform distribution
on $\mathbb{R}^n$’. In this article, we show (in Claim 3.9) another important property of this
parameter: for $s > \sqrt{2}\,\eta_\varepsilon(L)$, if we sample a point from $D_{L,s}$ and add Gaussian noise
$\nu_s$, we obtain a distribution whose statistical distance to a continuous Gaussian $\nu_{\sqrt{2}s}$
is at most $4\varepsilon$. Notice that $\nu_{\sqrt{2}s}$ is the distribution one obtains when summing two
independent samples from $\nu_s$. Hence, intuitively, the noise $\nu_s$ is enough to hide the
discrete structure of $D_{L,s}$.

The following two upper bounds on the smoothing parameter appear in Micciancio
and Regev [2007].

LEMMA 2.11. For any n-dimensional lattice L, $\eta_\varepsilon(L) \le \sqrt{n}/\lambda_1(L^*)$ where
$\varepsilon = 2^{-n}$.

LEMMA 2.12. For any n-dimensional lattice L and $\varepsilon > 0$,

$$\eta_\varepsilon(L) \le \sqrt{\frac{\ln(2n(1 + 1/\varepsilon))}{\pi}} \cdot \lambda_n(L).$$

In particular, for any superlogarithmic function $\omega(\log n)$, $\eta_{\varepsilon(n)}(L) \le \sqrt{\omega(\log n)} \cdot \lambda_n(L)$
for some negligible function $\varepsilon(n)$.


We also need the following simple lower bound on the smoothing parameter.

CLAIM 2.13. For any lattice L and any $\varepsilon > 0$,

$$\eta_\varepsilon(L) \ge \sqrt{\frac{\ln 1/\varepsilon}{\pi}} \cdot \frac{1}{\lambda_1(L^*)} \ge \sqrt{\frac{\ln 1/\varepsilon}{\pi}} \cdot \frac{\lambda_n(L)}{n}.$$

In particular, for any $\varepsilon(n) = o(1)$ and any constant c > 0, $\eta_{\varepsilon(n)}(L) > c/\lambda_1(L^*) \ge c\lambda_n(L)/n$
for large enough n.

PROOF. Let $v \in L^*$ be a vector of length $\lambda_1(L^*)$ and let $s = \eta_\varepsilon(L)$. Then,

$$\varepsilon = \rho_{1/s}(L^* \setminus \{0\}) \ge \rho_{1/s}(v) = \exp(-\pi(s\lambda_1(L^*))^2).$$

The first inequality follows by solving for s. The second inequality is by
Lemma 2.3.


The Fourier Transform. We briefly review some of the important properties of
the Fourier transform. In the following, we omit certain technical conditions as
these will always be satisfied in our applications. For a more precise and in-depth
treatment, see, for example, Ebeling [2002]. The Fourier transform of a function
$h : \mathbb{R}^n \to \mathbb{C}$ is defined to be

$$\hat{h}(w) = \int_{\mathbb{R}^n} h(x)\,e^{-2\pi i\langle x, w\rangle}\,dx.$$

From the definition, we can obtain two useful formulas; first, if h is defined by
h(x) = g(x + v) for some function g and vector v then

$$\hat{h}(w) = e^{2\pi i\langle v, w\rangle}\,\hat{g}(w). \tag{9}$$

Similarly, if h is defined by $h(x) = e^{2\pi i\langle x, v\rangle} g(x)$ for some function g and vector v
then

$$\hat{h}(w) = \hat{g}(w - v). \tag{10}$$

Another important fact is that the Gaussian is its own Fourier transform, that is,
$\hat\rho = \rho$. More generally, for any s > 0 it holds that $\hat\rho_s = s^n \rho_{1/s}$. Finally, we will use
the following formulation of the Poisson summation formula.

LEMMA 2.14 (POISSON SUMMATION FORMULA). For any lattice L and any
function $f : \mathbb{R}^n \to \mathbb{C}$,

$$f(L) = \det(L^*)\,\hat{f}(L^*).$$


3. Main Theorem 


Our main theorem is the following. The connection to the standard lattice problems 
GAPSVP and SIVP will be established in Section 3.3 by polynomial time reductions 
to DGS. 


THEOREM 3.1 (MAIN THEOREM). Let $\varepsilon = \varepsilon(n)$ be some negligible function of
n. Also, let p = p(n) be some integer and $\alpha = \alpha(n) \in (0, 1)$ be such that $\alpha p > 2\sqrt{n}$.
Assume that we have access to an oracle W that solves $\mathrm{LWE}_{p,\Psi_\alpha}$ given a polynomial
number of samples. Then, there exists an efficient quantum algorithm for
$\mathrm{DGS}_{\sqrt{2n}\cdot\eta_\varepsilon(L)/\alpha}$.

PROOF. The input to our algorithm is an n-dimensional lattice L and a number
$r > \sqrt{2n} \cdot \eta_\varepsilon(L)/\alpha$. Our goal is to output a sample from $D_{L,r}$. Let $r_i$ denote
$r \cdot (\alpha p/\sqrt{n})^i$. The algorithm starts by producing $n^c$ samples from $D_{L,r_{3n}}$, where c is the
constant from the iterative step lemma, Lemma 3.3. By Claim 2.13,
$r_{3n} > 2^{3n} r > 2^{2n}\lambda_n(L)$, and hence we can produce these samples efficiently by the procedure
described in the bootstrapping lemma, Lemma 3.2. Next, for $i = 3n, 3n-1, \ldots, 1$
we use our $n^c$ samples from $D_{L,r_i}$ to produce $n^c$ samples from $D_{L,r_{i-1}}$. The procedure
that does this, called the iterative step, is the core of the algorithm and is described
in Lemma 3.3. Notice that the condition in Lemma 3.3 is satisfied since for all
$i \ge 1$, $r_i \ge r_1 = r\alpha p/\sqrt{n} > \sqrt{2}\,p\,\eta_\varepsilon(L)$. At the end of the loop, we end up with $n^c$
samples from $D_{L,r_0} = D_{L,r}$ and we complete the algorithm by simply outputting
the first of those.
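The control flow of this proof can be summarized in a few lines. In the sketch below, `bootstrap` and `iterative_step` are hypothetical callables standing in for the procedures of Lemma 3.2 and Lemma 3.3 (they are not implemented here), and only the schedule of radii $r_i = r \cdot (\alpha p/\sqrt{n})^i$ is computed explicitly.

```python
import math

def radii(r: float, n: int, alpha: float, p: int) -> list:
    """r_i = r * (alpha * p / sqrt(n))^i for i = 0, ..., 3n."""
    ratio = alpha * p / math.sqrt(n)
    return [r * ratio ** i for i in range(3 * n + 1)]

def dgs_via_iterative_step(r, n, alpha, p, bootstrap, iterative_step):
    """Outer loop of the main theorem; the two callables are assumed placeholders."""
    rs = radii(r, n, alpha, p)
    samples = bootstrap(rs[3 * n])                  # samples from D_{L, r_{3n}}
    for i in range(3 * n, 0, -1):
        samples = iterative_step(samples, rs[i])    # now samples from D_{L, r_{i-1}}
    return samples[0]                               # one sample from D_{L, r_0} = D_{L, r}
```

Since $\alpha p > 2\sqrt{n}$, each radius is more than twice the previous one, which is why 3n iterations suffice to reach the exponentially large radius required by the bootstrapping step.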


3.1. BOOTSTRAPPING. 


LEMMA 3.2 (BOOTSTRAPPING). There exists an efficient algorithm that, given
any n-dimensional lattice L and $r > 2^{2n}\lambda_n(L)$, outputs a sample from a distribution
that is within statistical distance $2^{-\Omega(n)}$ of $D_{L,r}$.

PROOF. By using the LLL basis reduction algorithm [Lenstra et al. 1982], we
obtain a basis for L of length at most $2^n\lambda_n(L)$ and let $\mathcal{P}(L)$ be the parallelepiped
generated by this basis. The sampling procedure samples a vector y from $\nu_r$ and
then outputs $y - (y \bmod \mathcal{P}(L)) \in L$. Notice that $\|y \bmod \mathcal{P}(L)\| \le \mathrm{diam}(\mathcal{P}(L)) \le n 2^n \lambda_n(L)$.

Our goal is to show that the resulting distribution is exponentially close to $D_{L,r}$.
By Lemma 2.5, all but an exponentially small part of $D_{L,r}$ is concentrated on points
of norm at most $\sqrt{n}\,r$. So consider any $x \in L$ with $\|x\| \le \sqrt{n}\,r$. By definition, the
probability given to it by $D_{L,r}$ is $\rho_r(x)/\rho_r(L)$. By Lemma 2.14, the denominator
is $\rho_r(L) = \det(L^*) \cdot r^n \rho_{1/r}(L^*) \ge \det(L^*) \cdot r^n$ and hence the probability is at most
$\rho_r(x)/(\det(L^*) \cdot r^n) = \det(L)\,\nu_r(x)$. On the other hand, by Claim 2.1, the probability
given to $x \in L$ by our procedure is

$$\int_{x + \mathcal{P}(L)} \nu_r(y)\,dy \ge (1 - 2^{-\Omega(n)}) \det(L)\,\nu_r(x).$$

Together, these facts imply that our output distribution is within statistical distance
$2^{-\Omega(n)}$ of $D_{L,r}$.
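The sampling procedure in this proof amounts to drawing y from $\nu_r$, writing y in the given basis, and flooring the coefficients. A two-dimensional sketch follows; it assumes the basis is supplied directly with integer entries (the LLL reduction step is omitted), and the function name and return convention are ours.

```python
import math
import random

def bootstrap_sample_2d(basis, r: float, rng: random.Random):
    """basis = (v1, v2) with integer entries; returns (coeffs, lattice_point, y)."""
    (a, c), (b, d) = basis                    # v1 = (a, c), v2 = (b, d)
    det = a * d - b * c
    sigma = r / math.sqrt(2 * math.pi)        # nu_r has this standard deviation per coordinate
    y = (rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
    # Coefficients of y in the basis, via Cramer's rule.
    c1 = (y[0] * d - b * y[1]) / det
    c2 = (a * y[1] - y[0] * c) / det
    # Subtracting (y mod P(L)) from y keeps the integer parts of the coefficients.
    k1, k2 = math.floor(c1), math.floor(c2)
    point = (k1 * a + k2 * b, k1 * c + k2 * d)
    return (k1, k2), point, y
```

The output point always lies in L, and y minus the output lies in the fundamental parallelepiped, matching the expression $y - (y \bmod \mathcal{P}(L))$ above.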


3.2. THE ITERATIVE STEP 


LEMMA 3.3 (THE ITERATIVE STEP). Let $\varepsilon = \varepsilon(n)$ be a negligible function,
$\alpha = \alpha(n) \in (0, 1)$ be a real number, and $p = p(n) \ge 2$ be an integer. Assume that we have
access to an oracle W that solves $\mathrm{LWE}_{p,\Psi_\alpha}$ given a polynomial number of samples.
Then, there exists a constant c > 0 and an efficient quantum algorithm that, given
any n-dimensional lattice L, a number $r > \sqrt{2}\,p\,\eta_\varepsilon(L)$, and $n^c$ samples from $D_{L,r}$,
produces a sample from $D_{L,r\sqrt{n}/(\alpha p)}$.

Note that the output distribution is taken with respect to the randomness (and
quantum measurements) used in the algorithm, and not with respect to the input
samples. In particular, this means that from the same set of $n^c$ samples from $D_{L,r}$
we can produce any polynomial number of samples from $D_{L,r\sqrt{n}/(\alpha p)}$.

PROOF. The algorithm consists of two main parts. The first part is shown in
Lemma 3.4. There, we describe a (classical) algorithm that, using W and the samples
from $D_{L,r}$, solves $\mathrm{CVP}_{L^*,\alpha p/(\sqrt{2}r)}$. The second part is shown in Lemma 3.14. There,
we describe a quantum algorithm that, given an oracle that solves $\mathrm{CVP}_{L^*,\alpha p/(\sqrt{2}r)}$,
outputs a sample from $D_{L,r\sqrt{n}/(\alpha p)}$. This is the only quantum component in this
article. We note that the condition in Lemma 3.14 is satisfied since by Claim 2.13,

$$\alpha p/(\sqrt{2}r) \le 1/\eta_\varepsilon(L) \le \lambda_1(L^*)/2.$$


3.2.1. From Samples to CVP. Our goal in this section is to prove the 
following. 


LEMMA 3.4 (FIRST PART OF ITERATIVE STEP). Let $\varepsilon = \varepsilon(n)$ be a negligible
function, $p = p(n) \ge 2$ be an integer, and $\alpha = \alpha(n) \in (0, 1)$ be a real number. Assume
that we have access to an oracle W that solves $\mathrm{LWE}_{p,\Psi_\alpha}$ given a polynomial
number of samples. Then, there exist a constant c > 0 and an efficient algorithm
that, given any n-dimensional lattice L, a number $r > \sqrt{2}\,p\,\eta_\varepsilon(L)$, and $n^c$ samples
from $D_{L,r}$, solves $\mathrm{CVP}_{L^*,\alpha p/(\sqrt{2}r)}$.


For an n-dimensional lattice L, some $0 < d < \lambda_1(L)/2$, and an integer $p \ge 2$, we
say that an algorithm solves $\mathrm{CVP}^{(p)}_{L,d}$ if, given any point $x \in \mathbb{R}^n$ within distance d of
L, it outputs $L^{-1}\kappa_L(x) \bmod p \in \mathbb{Z}_p^n$, the coefficient vector of the closest vector to
x reduced modulo p. We start with the following lemma, which shows a reduction
from $\mathrm{CVP}_{L,d}$ to $\mathrm{CVP}^{(p)}_{L,d}$.

LEMMA 3.5 (FINDING COEFFICIENTS MODULO p IS SUFFICIENT). There exists
an efficient algorithm that, given a lattice L, a number $d < \lambda_1(L)/2$ and an integer
$p \ge 2$, solves $\mathrm{CVP}_{L,d}$ given access to an oracle for $\mathrm{CVP}^{(p)}_{L,d}$.


PROOF. Our input is a point x within distance d of L. We define a sequence of
points $x_1 = x, x_2, x_3, \ldots$ as follows. Let $a_i = L^{-1}\kappa_L(x_i) \in \mathbb{Z}^n$ be the coefficient
vector of the closest lattice point to $x_i$. We define $x_{i+1} = (x_i - L(a_i \bmod p))/p$.
Notice that the closest lattice point to $x_{i+1}$ is $L(a_i - (a_i \bmod p))/p \in L$ and hence
$a_{i+1} = (a_i - (a_i \bmod p))/p$. Moreover, the distance of $x_{i+1}$ from L is at most $d/p^i$.
Also note that this sequence can be computed by using the oracle.

After n steps, we have a point $x_{n+1}$ whose distance to the lattice is at most $d/p^n$.
We now apply an algorithm for approximately solving the closest vector problem,
such as Babai’s nearest plane algorithm [Babai 1986]. This yields a lattice point
La within distance $2^n \cdot d/p^n \le d < \lambda_1(L)/2$ of $x_{n+1}$. Hence, La is the lattice point
closest to $x_{n+1}$ and we managed to recover $a_{n+1} = a$. Knowing $a_{n+1}$ and $a_n \bmod p$
(by using the oracle), we can now recover $a_n = p\,a_{n+1} + (a_n \bmod p)$. Continuing
this process, we can recover $a_{n-1}, a_{n-2}, \ldots, a_1$. This completes the algorithm since
$La_1$ is the closest point to $x_1 = x$.
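The reduction in this proof can be traced on a toy instance. The sketch below specializes to the lattice $\mathbb{Z}^n$ (basis equal to the identity), where both the $\mathrm{CVP}^{(p)}$ oracle and the final approximate closest-vector step reduce to coordinate-wise rounding. These stand-ins, the function names, and the explicit `steps` parameter are assumptions made for illustration; exact rationals keep the arithmetic faithful.

```python
from fractions import Fraction
import math

def round_nearest(x) -> int:
    """Closest integer, ties broken toward the smaller one."""
    return math.ceil(x - Fraction(1, 2))

def oracle_mod_p(x, p: int):
    """Stand-in CVP^(p) oracle for Z^n: coefficients of the closest point, mod p."""
    return [round_nearest(c) % p for c in x]

def cvp_from_mod_p_oracle(x, p: int, steps: int):
    """Recover the closest point of Z^n to x using only the mod-p oracle."""
    residues = []
    cur = list(x)
    for _ in range(steps):
        r = oracle_mod_p(cur, p)                 # a_i mod p
        residues.append(r)
        cur = [(c - ri) / p for c, ri in zip(cur, r)]   # x_{i+1}
    a = [round_nearest(c) for c in cur]          # stand-in for Babai's algorithm
    for r in reversed(residues):
        a = [p * ai + ri for ai, ri in zip(a, r)]       # a_i = p * a_{i+1} + (a_i mod p)
    return a
```

For a general lattice, the oracle and the final step would operate on coefficient vectors with respect to the basis L, as in the proof; the unwinding loop is unchanged.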


As we noted in the proof of Lemma 3.3, for our choice of r, $\alpha p/(\sqrt{2}r) \le \lambda_1(L^*)/2$.
Hence, in order to prove Lemma 3.4, it suffices to present an efficient algorithm
for $\mathrm{CVP}^{(p)}_{L^*,\alpha p/(\sqrt{2}r)}$. We do this by combining two lemmas. The first, Lemma 3.7,
shows an algorithm W′ that, given samples from $A_{s,\Psi_\beta}$ for some (unknown) $\beta \le \alpha$,
outputs s with probability exponentially close to 1 by using W as an oracle. Its
proof is based on Lemma 3.6. The second, Lemma 3.11, is the main lemma of this
subsection, and shows how to use W′ and the given samples from $D_{L,r}$ in order to
solve $\mathrm{CVP}^{(p)}_{L^*,\alpha p/(\sqrt{2}r)}$.


LEMMA 3.6 (VERIFYING SOLUTIONS OF LWE). Let $p = p(n) \ge 1$ be some integer.
There exists an efficient algorithm that, given $s'$ and samples from $A_{s,\Psi_\alpha}$ for some
(unknown) $s \in \mathbb{Z}_p^n$ and $\alpha < 1$, outputs whether $s = s'$ and is correct with probability
exponentially close to 1.


We remark that the lemma holds also for all $\alpha \le O(\sqrt{\log n})$ with essentially the
same proof.


PROOF. The idea is to perform a statistical test on samples from $A_{s,\Psi_\alpha}$ that
checks whether $s = s'$. Let $\xi$ be the distribution on $\mathbb{T}$ obtained by sampling $(a, x)$
from $A_{s,\Psi_\alpha}$ and outputting $x - \langle a, s'\rangle/p \in \mathbb{T}$. The algorithm takes $n$ samples
$y_1, \ldots, y_n$ from $\xi$. It then computes $z := \frac{1}{n}\sum_{i=1}^{n} \cos(2\pi y_i)$. If $z > 0.02$, it
decides that $s = s'$, otherwise it decides that $s \ne s'$.

We now analyze this algorithm. Consider the distribution $\xi$. Notice that it can be
obtained by sampling $e$ from $\Psi_\alpha$, sampling $a$ uniformly from $\mathbb{Z}_p^n$, and outputting
$e + \langle a, s - s'\rangle/p \in \mathbb{T}$. From this, it easily follows that if $s = s'$, $\xi$ is exactly $\Psi_\alpha$.
Otherwise, if $s \ne s'$, we claim that $\xi$ has a period of $1/k$ for some integer $k \ge 2$.
Indeed, let $j$ be an index on which $s_j \ne s'_j$. Then, the distribution of $a_j(s_j - s'_j) \bmod p$
is periodic with period $\gcd(p, s_j - s'_j) < p$. This clearly implies that the distribution
of $a_j(s_j - s'_j)/p \bmod 1$ is periodic with period $1/k$ for some $k \ge 2$. Since a sample
from $\xi$ can be obtained by adding a sample from $a_j(s_j - s'_j)/p \bmod 1$ and an
independent sample from some other distribution, we obtain that $\xi$ also has the
same period of $1/k$.

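Consider the expectation$^4$

$$\tilde{z} := \mathop{\mathrm{Exp}}_{y \sim \xi}[\cos(2\pi y)] = \int_0^1 \cos(2\pi y)\,\xi(y)\,dy = \mathrm{Re}\left[\int_0^1 \exp(2\pi i y)\,\xi(y)\,dy\right].$$

First, a routine calculation shows that for $\xi = \Psi_\alpha$, $\tilde{z} = \exp(-\pi\alpha^2)$, which is at
least 0.04 for $\alpha < 1$. Moreover, if $\xi$ has a period of $1/k$, then

$$\int_0^1 \exp(2\pi i y)\,\xi(y)\,dy = \int_0^1 \exp\left(2\pi i\left(y + \tfrac{1}{k}\right)\right)\xi(y)\,dy = \exp(2\pi i/k)\int_0^1 \exp(2\pi i y)\,\xi(y)\,dy$$

which implies that if $k \ge 2$ then $\tilde{z} = 0$. We complete the proof by noting that by
the Chernoff bound, $|z - \tilde{z}| \le 0.01$ with probability exponentially close to 1.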
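The statistical test of Lemma 3.6 is simple enough to sketch directly. The following is an illustrative toy, not the paper's code: the helper names `lwe_sample` and `verify_candidate` are made up here, the noise $\Psi_\alpha$ is modelled as a normal variable with standard deviation $\alpha/\sqrt{2\pi}$ reduced modulo 1, and the demo uses far more than $n$ samples so that a small concrete run is reliable.

```python
import math
import random


def lwe_sample(s, p, alpha, rng):
    """One sample (a, x) from A_{s,Psi_alpha}: x = <a, s>/p + e mod 1."""
    a = [rng.randrange(p) for _ in s]
    e = rng.gauss(0.0, alpha / math.sqrt(2 * math.pi))
    x = (sum(ai * si for ai, si in zip(a, s)) / p + e) % 1.0
    return a, x


def verify_candidate(samples, s_prime, p, threshold=0.02):
    """Return (decision, z): decide s == s_prime iff the mean of cos(2*pi*y) exceeds threshold."""
    total = 0.0
    for a, x in samples:
        y = x - sum(ai * si for ai, si in zip(a, s_prime)) / p
        total += math.cos(2 * math.pi * y)      # cos is 1-periodic in y, so no reduction needed
    z = total / len(samples)
    return z > threshold, z


if __name__ == "__main__":
    rng = random.Random(0)
    p, alpha = 17, 0.3
    secret = [3, 14, 15, 9, 2, 6]
    samples = [lwe_sample(secret, p, alpha, rng) for _ in range(40000)]
    print(verify_candidate(samples, secret, p))
    print(verify_candidate(samples, [4, 14, 15, 9, 2, 6], p))
```

For the correct candidate the statistic concentrates near $\exp(-\pi\alpha^2)$, and for a wrong candidate near 0, which is what the threshold 0.02 separates.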


$^4$ We remark that this expectation is essentially the Fourier series of $\xi$ at point 1 and that the following
arguments can be explained in terms of properties of the Fourier series.


LEMMA 3.7 (HANDLING ERROR $\Psi_\beta$ FOR $\beta \le \alpha$). Let $p = p(n) \ge 2$ be some
integer and $\alpha = \alpha(n) \in (0, 1)$. Assume that we have access to an oracle $W$ that
solves $\mathrm{LWE}_{p,\Psi_\alpha}$ by using a polynomial number of samples. Then, there exists an
efficient algorithm $W'$ that, given samples from $A_{s,\Psi_\beta}$ for some (unknown) $\beta \le \alpha$,
outputs $s$ with probability exponentially close to 1.


PROOF. The proof is based on the following idea: by adding the right amount of
noise, we can transform samples from $A_{s,\Psi_\beta}$ to samples from $A_{s,\Psi_\alpha}$ (or something
sufficiently close to it). Assume that the number of samples required by $W$ is
at most $n^c$ for some $c > 0$. Let $Z$ be the set of all integer multiples of $n^{-2c}\alpha^2$
between 0 and $\alpha^2$. For each $\gamma \in Z$, Algorithm $W'$ does the following $n$ times. It
takes $n^c$ samples from $A_{s,\Psi_\beta}$ and adds to the second element of each sample a
noise sampled independently from $\Psi_{\sqrt{\gamma}}$. This creates $n^c$ samples taken from the
distribution $A_{s,\Psi_{\sqrt{\beta^2+\gamma}}}$. It then applies $W$ and obtains some candidate $s'$. Using
Lemma 3.6, it checks whether $s' = s$. If the answer is yes, it outputs $s'$; otherwise,
it continues.

We now show that $W'$ finds $s$ with probability exponentially close to 1. By
Lemma 3.6, if $W'$ outputs some value, then this value is correct with probability
exponentially close to 1. Hence, it is enough to show that in one of the iterations, $W'$
outputs some value. Consider the smallest $\gamma \in Z$ such that $\gamma \ge \alpha^2 - \beta^2$. Clearly,
$\gamma \le \alpha^2 - \beta^2 + n^{-2c}\alpha^2$. Define $\alpha' = \sqrt{\beta^2 + \gamma}$. Then,

$$\alpha \le \alpha' \le \sqrt{\alpha^2 + n^{-2c}\alpha^2} \le (1 + n^{-2c})\alpha.$$

By Claim 2.2, the statistical distance between $\Psi_\alpha$ and $\Psi_{\alpha'}$ is at most $9n^{-2c}$. Hence,
the statistical distance between $n^c$ samples from $\Psi_\alpha$ and $n^c$ samples from $\Psi_{\alpha'}$ is at
most $9n^{-c}$. Therefore, for our choice of $\gamma$, $W$ outputs $s$ with probability at least
$1 - 9n^{-c}/2 - 2^{-\Omega(n)} \ge \frac{1}{2}$. The probability that $s$ is not found in any of the $n$ calls
to $W$ is at most $2^{-n}$.
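To make the choice of $\gamma$ concrete, here is a minimal sketch of the grid $Z$ and of the grid point used in the analysis. The function names are hypothetical and the parameters in the demo are arbitrary small values; the only facts used are that Gaussian widths add in quadrature and that the grid spacing is $n^{-2c}\alpha^2$.

```python
import math


def noise_grid(alpha, n, c):
    """All integer multiples of n^(-2c) * alpha^2 between 0 and alpha^2."""
    step = n ** (-2 * c) * alpha ** 2
    count = n ** (2 * c)
    return [k * step for k in range(count + 1)]


def best_gamma(alpha, beta, n, c):
    """Smallest grid point gamma with gamma >= alpha^2 - beta^2."""
    target = alpha ** 2 - beta ** 2
    return min(g for g in noise_grid(alpha, n, c) if g >= target)


if __name__ == "__main__":
    alpha, beta, n, c = 0.5, 0.31, 10, 1
    gamma = best_gamma(alpha, beta, n, c)
    # Adding Psi_sqrt(gamma) noise to Psi_beta noise gives width sqrt(beta^2 + gamma).
    alpha_prime = math.sqrt(beta ** 2 + gamma)
    print(gamma, alpha_prime, (1 + n ** (-2 * c)) * alpha)
```

The algorithm $W'$ does not know $\beta$, which is why it tries every grid point rather than computing this one directly.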


For the analysis of our main procedure in Lemma 3.11, we will need the following
claims regarding Gaussian measures on lattices. On first reading, the reader can just
read the statements of Claim 3.8 and Corollary 3.10 and skip directly to Lemma 3.11.
All claims show that in some sense, when working above the smoothing parameter,
the discrete Gaussian measure behaves like the continuous Gaussian measure. We
start with the following claim, showing that above the smoothing parameter, the
discrete Gaussian measure is essentially invariant under shifts.


CLAIM 3.8. For any lattice $L$, $c \in \mathbb{R}^n$, $\epsilon > 0$, and $r \ge \eta_\epsilon(L)$,
$$\rho_r(L + c) \in r^n \det(L^*)(1 \pm \epsilon).$$
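As a quick numerical sanity check of this statement (not part of the proof), one can look at the one-dimensional lattice $L = \mathbb{Z}$, for which $L^* = \mathbb{Z}$ and $\det(L^*) = 1$. The sketch below assumes truncated sums are an adequate stand-in for the infinite ones, and the function names are illustrative only.

```python
import math


def rho(r, x):
    """Gaussian function rho_r(x) = exp(-pi * (x / r)^2) in one dimension."""
    return math.exp(-math.pi * (x / r) ** 2)


def rho_shifted_integers(r, c, cutoff=200):
    """Truncated value of rho_r(Z + c)."""
    return sum(rho(r, k + c) for k in range(-cutoff, cutoff + 1))


def smoothing_epsilon(r, cutoff=200):
    """Truncated value of rho_{1/r}(Z minus {0}); the dual of Z is Z itself."""
    return 2 * sum(rho(1.0 / r, k) for k in range(1, cutoff + 1))


if __name__ == "__main__":
    r = 1.5
    eps = smoothing_epsilon(r)
    for c in (0.0, 0.25, 0.37, 0.5):
        # The claim predicts a ratio within [1 - eps, 1 + eps] for every shift c.
        print(c, rho_shifted_integers(r, c) / r, eps)
```

Even for a width as small as $r = 1.5$ the mass of the shifted lattice is almost independent of the shift.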


PROOF. Using the Poisson summation formula (Lemma 2.14) and the assumption
that $\rho_{1/r}(L^* \setminus \{0\}) \le \epsilon$,

$$\begin{aligned}
\rho_r(L + c) &= \sum_{x \in L} \rho_r(x + c) = \sum_{x \in L} \rho_{r,-c}(x) \\
&= \det(L^*) \sum_{y \in L^*} \widehat{\rho_{r,-c}}(y) \\
&= r^n \det(L^*) \sum_{y \in L^*} \exp(2\pi i \langle c, y\rangle)\,\rho_{1/r}(y) \\
&= r^n \det(L^*)(1 \pm \epsilon).
\end{aligned}$$


The following claim (which is only used to establish the corollary following it)
says that when adding a continuous Gaussian of width $s$ to a discrete Gaussian
of width $r$, with both $r$ and $s$ sufficiently greater than the smoothing parameter,
the resulting distribution is very close to a continuous Gaussian of the width we
would expect, namely $\sqrt{r^2 + s^2}$. To get some intuition on why we need to assume
that both Gaussians are sufficiently wide, notice for instance that if the discrete
Gaussian is very narrow, then it is concentrated on the origin, making the sum have
width $s$. Also, if the continuous Gaussian is too narrow, then the discrete structure
is still visible in the sum.


CLAIM 3.9. Let $L$ be a lattice, let $u \in \mathbb{R}^n$ be any vector, let $r, s > 0$ be two reals,
and let $t$ denote $\sqrt{r^2 + s^2}$. Assume that $rs/t = 1/\sqrt{1/r^2 + 1/s^2} \ge \eta_\epsilon(L)$ for some
$\epsilon < \frac{1}{2}$. Consider the continuous distribution $Y$ on $\mathbb{R}^n$ obtained by sampling from
$D_{L+u,r}$ and then adding a noise vector taken from $\nu_s$. Then, the statistical distance
between $Y$ and $\nu_t$ is at most $4\epsilon$.


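PROOF. The probability density function of $Y$ can be written as

$$\begin{aligned}
Y(x) &= \frac{1}{s^n \rho_r(L+u)} \sum_{y \in L+u} \rho_r(y)\,\rho_s(x - y) \\
&= \frac{1}{s^n \rho_r(L+u)} \sum_{y \in L+u} \exp\left(-\pi\left(\|y/r\|^2 + \|(x-y)/s\|^2\right)\right) \\
&= \frac{1}{s^n \rho_r(L+u)} \sum_{y \in L+u} \exp\left(-\pi\left(\frac{r^2+s^2}{r^2 s^2}\left\|y - \frac{r^2}{r^2+s^2}x\right\|^2 + \frac{1}{r^2+s^2}\|x\|^2\right)\right) \\
&= \exp\left(-\frac{\pi}{r^2+s^2}\|x\|^2\right) \frac{1}{s^n \rho_r(L+u)} \sum_{y \in L+u} \exp\left(-\pi\,\frac{r^2+s^2}{r^2 s^2}\left\|y - \frac{r^2}{r^2+s^2}x\right\|^2\right) \\
&= \frac{1}{s^n}\,\rho_t(x) \cdot \frac{\rho_{rs/t,\,(r/t)^2 x - u}(L)}{\rho_{r,-u}(L)} \\
&= \frac{1}{s^n}\,\rho_t(x) \cdot \frac{\widehat{\rho_{rs/t,\,(r/t)^2 x - u}}(L^*)}{\widehat{\rho_{r,-u}}(L^*)} \\
&= \rho_t(x)/t^n \cdot \frac{(t/rs)^n\,\widehat{\rho_{rs/t,\,(r/t)^2 x - u}}(L^*)}{(1/r)^n\,\widehat{\rho_{r,-u}}(L^*)} \qquad (11)
\end{aligned}$$

where in the next-to-last equality we used Lemma 2.14. Using Eq. (9),

$$\begin{aligned}
\widehat{\rho_{rs/t,\,(r/t)^2 x - u}}(w) &= \exp\left(-2\pi i \langle (r/t)^2 x - u, w\rangle\right) \cdot (rs/t)^n \rho_{t/rs}(w), \\
\widehat{\rho_{r,-u}}(w) &= \exp\left(2\pi i \langle u, w\rangle\right) \cdot r^n \rho_{1/r}(w).
\end{aligned}$$

Hence,

$$\begin{aligned}
\left|1 - (t/rs)^n\,\widehat{\rho_{rs/t,\,(r/t)^2 x - u}}(L^*)\right| &\le \rho_{t/rs}(L^* \setminus \{0\}) \le \epsilon, \\
\left|1 - (1/r)^n\,\widehat{\rho_{r,-u}}(L^*)\right| &\le \rho_{1/r}(L^* \setminus \{0\}) \le \epsilon
\end{aligned}$$

where the last inequality follows from $1/r \le t/rs$. Hence, the quotient in (11) is
between $(1 - \epsilon)/(1 + \epsilon) \ge 1 - 2\epsilon$ and $(1 + \epsilon)/(1 - \epsilon) \le 1 + 4\epsilon$. This implies that

$$|Y(x) - \rho_t(x)/t^n| \le \rho_t(x)/t^n \cdot 4\epsilon.$$

We complete the proof by integrating over $\mathbb{R}^n$.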


COROLLARY 3.10. Let $L$ be a lattice, let $z, u \in \mathbb{R}^n$ be vectors, and let $r, \alpha > 0$
be two reals. Assume that $1/\sqrt{1/r^2 + (\|z\|/\alpha)^2} \ge \eta_\epsilon(L)$ for some $\epsilon < \frac{1}{2}$. Then, the
distribution of $\langle z, v\rangle + e$ where $v$ is distributed according to $D_{L+u,r}$ and $e$ is a normal
variable with mean 0 and standard deviation $\alpha/\sqrt{2\pi}$, is within statistical distance
$4\epsilon$ of a normal variable with mean 0 and standard deviation $\sqrt{(r\|z\|)^2 + \alpha^2}/\sqrt{2\pi}$.
In particular, since statistical distance cannot increase by applying a function, the
distribution of $\langle z, v\rangle + e \bmod 1$ is within statistical distance $4\epsilon$ of $\Psi_{\sqrt{(r\|z\|)^2 + \alpha^2}}$.

PROOF. We first observe that the distribution of $\langle z, v\rangle + e$ is exactly the same
as that of $\langle z, v + h\rangle$ where $h$ is distributed as the continuous Gaussian $\nu_{\alpha/\|z\|}$. Next,
by Claim 3.9, we know that the distribution of $v + h$ is within statistical distance
$4\epsilon$ of the continuous Gaussian $\nu_{\sqrt{r^2 + (\alpha/\|z\|)^2}}$. Taking the inner product of this
continuous Gaussian with $z$ leads to a normal distribution with mean 0 and standard
deviation $\sqrt{(r\|z\|)^2 + \alpha^2}/\sqrt{2\pi}$, and we complete the proof by using the fact that
statistical distance cannot increase by applying a function (inner product with $z$ in
this case).


LEMMA 3.11 (MAIN PROCEDURE OF THE FIRST PART). Let $\epsilon = \epsilon(n)$ be a negligible
function, $p = p(n) \ge 2$ be an integer, and $\alpha = \alpha(n) \in (0, 1)$ be a real number.
Assume that we have access to an oracle $W$ that for all $\beta \le \alpha$, finds $s$ given a
polynomial number of samples from $A_{s,\Psi_\beta}$ (without knowing $\beta$). Then, there exists an
efficient algorithm that given an $n$-dimensional lattice $L$, a number $r > \sqrt{2}\,p\,\eta_\epsilon(L)$,
and a polynomial number of samples from $D_{L,r}$, solves $\mathrm{CVP}^{(p)}_{L^*,\alpha p/(\sqrt{2}r)}$.


PROOF. We describe a procedure that given $x$ within distance $\alpha p/(\sqrt{2}r)$
of $L^*$, outputs samples from the distribution $A_{s,\Psi_\beta}$ for some $\beta \le \alpha$ where
$s = (L^*)^{-1}\kappa_{L^*}(x) \bmod p$. By running this procedure a polynomial number of times
and then using $W$, we can find $s$.

The procedure works as follows. We sample a vector $v \in L$ from $D_{L,r}$, and let
$a = L^{-1}v \bmod p$. We then output

$$(a, \langle x, v\rangle/p + e \bmod 1) \qquad (12)$$

where $e \in \mathbb{R}$ is chosen according to a normal distribution with standard deviation
$\alpha/(2\sqrt{\pi})$. We claim that the distribution given by this procedure is within negligible
statistical distance of $A_{s,\Psi_\beta}$ for some $\beta \le \alpha$.

We first notice that the distribution of $a$ is very close to uniform. Indeed,
the probability of obtaining each $a \in \mathbb{Z}_p^n$ is proportional to $\rho_r(pL + La)$. Using
$\eta_\epsilon(pL) = p\,\eta_\epsilon(L) < r$ and Claim 3.8, the latter is $(r/p)^n \det(L^*)(1 \pm \epsilon)$, which
implies that the statistical distance between the distribution of $a$ and the uniform
distribution is negligible.

Next, we condition on any fixed value of $a$ and consider the distribution of the
second element in (12). Define $x' = x - \kappa_{L^*}(x)$ and note that $\|x'\| \le \alpha p/(\sqrt{2}r)$.
Then,
$$\langle x, v\rangle/p + e \bmod 1 = \langle x'/p, v\rangle + e + \langle \kappa_{L^*}(x), v\rangle/p \bmod 1.$$
Now,
$$\langle \kappa_{L^*}(x), v\rangle = \langle (L^*)^{-1}\kappa_{L^*}(x), L^{-1}v\rangle$$
since $L^{-1} = (L^*)^T$. In words, this says that the inner product between $\kappa_{L^*}(x)$ and
$v$ (and in fact, between any vector in $L^*$ and any vector in $L$) is the same as the
inner product between the corresponding coefficient vectors. Since the coefficient
vectors are integer,
$$\langle \kappa_{L^*}(x), v\rangle \bmod p = \langle s, a\rangle \bmod p$$
from which it follows that $\langle \kappa_{L^*}(x), v\rangle/p \bmod 1$ is exactly $\langle s, a\rangle/p \bmod 1$.

We complete the proof by applying Corollary 3.10, which shows that the
distribution of the remaining part $\langle x'/p, v\rangle + e$ is within negligible statistical distance
of $\Psi_\beta$ for $\beta = \sqrt{(r\|x'\|/p)^2 + \alpha^2/2} \le \alpha$, as required. Here we used that the
distribution of $v$ is $D_{pL+La,r}$ (since we are conditioning on $a$), the distribution of $e$ is
normal with mean 0 and standard deviation $(\alpha/\sqrt{2})/\sqrt{2\pi}$, and that

$$1/\sqrt{1/r^2 + (\sqrt{2}\|x'\|/(p\alpha))^2} \ge r/\sqrt{2} > \eta_\epsilon(pL).$$
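The procedure in (12) can be tried out on a toy instance. The sketch below makes several simplifying assumptions that are not in the paper: the lattice is $L = L^* = \mathbb{Z}^n$ (identity basis), the discrete Gaussian $D_{\mathbb{Z},r}$ is sampled by rejection with a tail cut at $6r$, and the function names are hypothetical. It produces pairs $(a, b)$ and then checks, via the cosine statistic of Lemma 3.6, that they behave like LWE samples for $s$ equal to the closest-point coefficients modulo $p$.

```python
import math
import random


def sample_discrete_gaussian_z(r, rng):
    """Rejection-sample x in Z with probability proportional to exp(-pi * x^2 / r^2)."""
    bound = int(math.ceil(6 * r))          # mass beyond 6r is negligible
    while True:
        x = rng.randint(-bound, bound)
        if rng.random() < math.exp(-math.pi * (x / r) ** 2):
            return x


def lwe_sample_from_cvp_instance(x, p, r, alpha, rng):
    """One output of procedure (12), specialised to L = L* = Z^n."""
    v = [sample_discrete_gaussian_z(r, rng) for _ in x]       # v ~ D_{L,r}
    a = [vi % p for vi in v]                                  # a = L^{-1} v mod p
    e = rng.gauss(0.0, alpha / (2 * math.sqrt(math.pi)))      # std alpha / (2 sqrt(pi))
    b = (sum(xi * vi for xi, vi in zip(x, v)) / p + e) % 1.0
    return a, b


if __name__ == "__main__":
    rng = random.Random(0)
    p, r, alpha = 7, 30.0, 0.2
    closest = [10, -3, 25, 4]                        # kappa_{L*}(x) in coefficients
    x = [10.015, -3.01, 25.02, 4.005]                # within alpha*p/(sqrt(2)*r) of the lattice
    s = [k % p for k in closest]
    total = 0.0
    for _ in range(2000):
        a, b = lwe_sample_from_cvp_instance(x, p, r, alpha, rng)
        total += math.cos(2 * math.pi * (b - sum(ai * si for ai, si in zip(a, s)) / p))
    print(total / 2000)
```

A statistic well above 0.02 indicates that the residual $b - \langle a, s\rangle/p$ is concentrated around 0 modulo 1, as the proof predicts.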


3.2.2. FROM CVP TO SAMPLES. In this section, we describe a quantum proce- 
dure that uses a CVP oracle in order to create samples from the discrete Gaussian 
distribution. We assume familiarity with some basic notions of quantum computa- 
tion, such as (pure) states, measurements, and the quantum Fourier transform. See, 
for example, Nielsen and Chuang [2000] for a good introduction. For clarity, we 
often omit the normalization factors from quantum states. 

The following lemma shows that we can efficiently create a "discrete quantum
Gaussian state" of width $r$ as long as $r$ is large enough compared with $\lambda_n(L)$. It
can be seen as the quantum analogue of Lemma 3.2. The assumption that $L \subseteq \mathbb{Z}^n$
is essentially without loss of generality since a lattice with rational coordinates can
always be rescaled so that $L \subseteq \mathbb{Z}^n$.


LEMMA 3.12. There exists an efficient quantum algorithm that, given an
$n$-dimensional lattice $L \subseteq \mathbb{Z}^n$ and $r > 2^{3n}\lambda_n(L)$, outputs a state that is within $\ell_2$
distance $2^{-\Omega(n)}$ of the normalized state corresponding to

$$\sum_{x \in L} \sqrt{\rho_r(x)}\,|x\rangle = \sum_{x \in L} \rho_{\sqrt{2}r}(x)\,|x\rangle. \qquad (13)$$


PROOF. We start by creating the "one-dimensional Gaussian state"

$$\sum_{x=-\sqrt{n}r}^{\sqrt{n}r} e^{-\pi (x/(\sqrt{2}r))^2}\,|x\rangle. \qquad (14)$$

This state can be created efficiently using a technique by Grover and Rudolph [2002]
who show that in order to create such a state, it suffices to be able to compute for
any $a, b \in \{-\sqrt{n}r, \ldots, \sqrt{n}r\}$ the sum $\sum_{x=a}^{b} e^{-\pi (x/r)^2}$ to within good precision.
This can be done using the same standard techniques used in sampling from the
normal distribution.

Repeating the procedure described above $n$ times creates a system whose state
is the $n$-fold tensor product of the state in Eq. (14), which can be written as

$$\sum_{x \in \{-\sqrt{n}r, \ldots, \sqrt{n}r\}^n} \rho_{\sqrt{2}r}(x)\,|x\rangle.$$

Since $\mathbb{Z}^n \cap \sqrt{n}rB_n \subseteq \{-\sqrt{n}r, \ldots, \sqrt{n}r\}^n$, Lemma 2.5 implies that this state is
within $\ell_2$ distance $2^{-\Omega(n)}$ of

$$\sum_{x \in \mathbb{Z}^n} \rho_{\sqrt{2}r}(x)\,|x\rangle \qquad (15)$$

and hence for our purposes we can assume that we have generated the state in
Eq. (15).

Next, using the LLL basis reduction algorithm [Lenstra et al. 1982], we obtain a
basis for $L$ of length at most $2^n\lambda_n(L)$ and let $P(L)$ be the parallelepiped generated
by this basis. We now compute in a new register $x \bmod P(L)$ and measure it. Let
$y \in P(L)$ denote the result and note that $\|y\| \le \mathrm{diam}(P(L)) \le n2^n\lambda_n(L)$. The state
we obtain after the measurement is

$$\sum_{x \in L+y} \rho_{\sqrt{2}r}(x)\,|x\rangle.$$

Finally, we subtract $y$ from our register, and obtain

$$\sum_{x \in L} \rho_{\sqrt{2}r}(x + y)\,|x\rangle.$$

Our goal is to show that this state is within $\ell_2$ distance $2^{-\Omega(n)}$ of the one in Eq. (13).
First, by Lemma 2.5, all but an exponentially small part of the $\ell_2$ norm of the state
in Eq. (13) is concentrated on points of norm at most $\sqrt{n} \cdot r$. So consider any $x \in L$
with $\|x\| \le \sqrt{n} \cdot r$. The amplitude squared given to it in Eq. (13) is $\rho_r(x)/\rho_r(L)$.
By Lemma 2.14, the denominator is $\rho_r(L) = \det(L^*) \cdot r^n \rho_{1/r}(L^*) \ge \det(L^*) \cdot r^n$
and hence the amplitude squared is at most $\rho_r(x)/(\det(L^*) \cdot r^n) = \det(L)\,\nu_r(x)$.

On the other hand, the amplitude squared given to $x$ by our procedure is
$\rho_r(x + y)/\rho_r(L + y)$. By Lemma 2.14, the denominator is

$$\rho_r(L + y) = \det(L^*) \cdot r^n \sum_{z \in L^*} e^{2\pi i \langle z, y\rangle} \rho_{1/r}(z) \le (1 + 2^{-\Omega(n)}) \det(L^*) \cdot r^n.$$

To obtain this inequality, first note that by the easy part of Lemma 2.3,
$\lambda_1(L^*) \ge 1/\lambda_n(L) > \sqrt{n}/r$, and then apply Lemma 2.5. Moreover, by Claim 2.1,
the numerator is at least $(1 - 2^{-\Omega(n)})\rho_r(x)$. Hence, the amplitude squared given to
$x$ is at least $(1 - 2^{-\Omega(n)}) \det(L)\,\nu_r(x)$, as required.


For a lattice $L$ and a positive integer $R$, we denote by $L/R$ the lattice obtained
by scaling down $L$ by a factor of $R$. The following technical claim follows from
the fact that almost all the mass of $\rho$ is on points of norm at most $\sqrt{n}$.


CLAIM 3.13. Let $R \ge 1$ be an integer and $L$ be an $n$-dimensional lattice
satisfying $\lambda_1(L) > 2\sqrt{n}$. Let $P(L)$ be some basic parallelepiped of $L$. Then, the $\ell_2$
distance between the normalized quantum states corresponding to

$$|\vartheta_1\rangle = \sum_{x \in L/R,\ \|x\| < \sqrt{n}} \rho(x)\,|x \bmod P(L)\rangle, \quad \text{and}$$

$$|\vartheta_2\rangle = \sum_{x \in L/R} \rho(x)\,|x \bmod P(L)\rangle = \sum_{x \in L/R \cap P(L)}\ \sum_{y \in L} \rho(x - y)\,|x\rangle$$

is $2^{-\Omega(n)}$.


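PROOF. We think of $|\vartheta_1\rangle$ and $|\vartheta_2\rangle$ as vectors in $R^n$-dimensional space. Let $Z$ be
the $\ell_2$ norm of $|\vartheta_1\rangle$. In the following we show that the $\ell_2$ distance between $|\vartheta_1\rangle$ and
$|\vartheta_2\rangle$ is at most $2^{-\Omega(n)}Z$. This is enough to establish that the $\ell_2$ distance between the
normalized quantum states corresponding to $|\vartheta_1\rangle$ and $|\vartheta_2\rangle$ is exponentially small.

We first obtain a good estimate of $Z$. Since $\lambda_1(L) > 2\sqrt{n}$, each "ket" in the
definition of $|\vartheta_1\rangle$ appears in the sum only once, and so

$$Z = \sum_{x \in L/R,\ \|x\| < \sqrt{n}} \rho(x)^2 = \rho(\sqrt{2}L/R \cap \sqrt{2n}B_n).$$

By applying Lemma 2.5 to the lattice $\sqrt{2}L/R$, we obtain that

$$(1 - 2^{-2n})\,\rho(\sqrt{2}L/R) \le Z \le \rho(\sqrt{2}L/R).$$

We complete the proof with an upper bound on the $\ell_2$ distance between the two
vectors. Using the monotonicity of norms,

$$\begin{aligned}
\big\| |\vartheta_1\rangle - |\vartheta_2\rangle \big\|_2 &\le \big\| |\vartheta_1\rangle - |\vartheta_2\rangle \big\|_1 \\
&= \sum_{x \in L/R,\ \|x\| \ge \sqrt{n}} \rho(x) \\
&\le 2^{-2n}\rho(L/R) && \text{(by Lemma 2.5)} \\
&\le 2^{-2n}2^{n/2}\rho(\sqrt{2}L/R) && \text{(by Lemma 2.4)} \\
&\le 2^{-n}\rho(\sqrt{2}L/R).
\end{aligned}$$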


We now prove the main lemma of this subsection. 


LEMMA 3.14 (SECOND PART OF ITERATIVE STEP). There exists an efficient
quantum algorithm that, given any $n$-dimensional lattice $L$, a number $d < \lambda_1(L^*)/2$,
and an oracle that solves $\mathrm{CVP}_{L^*,d}$, outputs a sample from $D_{L,\sqrt{n}/(\sqrt{2}d)}$.


PROOF. By scaling, we can assume without loss of generality that $d = \sqrt{n}$. Let
$R \ge 2^{3n}\lambda_n(L^*)$ be a large enough integer. We can assume that $\log R$ is polynomial
in the input size (since such an $R$ can be computed in polynomial time given the
lattice $L$). Our first step is to create a state exponentially close to

$$\sum_{x \in L^*/R \cap P(L^*)}\ \sum_{y \in L^*} \rho(x - y)\,|x\rangle. \qquad (16)$$

This is a state on $n \log R$ qubits, a number that is polynomial in the input size. To
do so, we first use Lemma 3.12 with $r = 1/\sqrt{2}$ and the lattice $L^*/R$ to create the
state

$$\sum_{x \in L^*/R} \rho(x)\,|x\rangle.$$

By Lemma 2.5, this is exponentially close to

$$\sum_{x \in L^*/R,\ \|x\| < \sqrt{n}} \rho(x)\,|x\rangle.$$

Next, we compute $x \bmod P(L^*)$ in a new register and obtain

$$\sum_{x \in L^*/R,\ \|x\| < \sqrt{n}} \rho(x)\,|x, x \bmod P(L^*)\rangle.$$

Using the CVP oracle, we can recover $x$ from $x \bmod P(L^*)$. This allows us to
uncompute the first register and obtain

$$\sum_{x \in L^*/R,\ \|x\| < \sqrt{n}} \rho(x)\,|x \bmod P(L^*)\rangle.$$

Using Claim 3.13, this state is exponentially close to the required state (16).
In the second step, we apply the quantum Fourier transform. First, using the
natural mapping between $L^*/R \cap P(L^*)$ and $\mathbb{Z}_R^n$, we can rewrite (16) as

$$\sum_{s \in \mathbb{Z}_R^n}\ \sum_{r \in \mathbb{Z}^n} \rho(L^*s/R - L^*r)\,|s\rangle.$$

We now apply the quantum Fourier transform on $\mathbb{Z}_R^n$. We obtain a state in which
the amplitude of $|t\rangle$ for $t \in \mathbb{Z}_R^n$ is proportional to

$$\begin{aligned}
\sum_{s \in \mathbb{Z}_R^n}\ \sum_{r \in \mathbb{Z}^n} \rho(L^*s/R - L^*r) \exp(2\pi i \langle s, t\rangle/R)
&= \sum_{s \in \mathbb{Z}^n} \rho(L^*s/R) \exp(2\pi i \langle s, t\rangle/R) \\
&= \sum_{x \in L^*/R} \rho(x) \exp(2\pi i \langle (L^*)^{-1}x, t\rangle) \\
&= \sum_{x \in L^*/R} \rho(x) \exp(2\pi i \langle x, Lt\rangle) \\
&= \det(RL) \sum_{y \in RL} \rho(y - Lt)
\end{aligned}$$

where the last equality follows from Lemma 2.14 and Eq. (10). Hence, the resulting
state can be equivalently written as

$$\sum_{x \in P(RL) \cap L}\ \sum_{y \in RL} \rho(y - x)\,|x\rangle.$$

Notice that $\lambda_1(RL) = R\lambda_1(L) \ge R/\lambda_n(L^*) \ge 2^{3n}$. Hence, we can apply Claim 3.13
to the lattice $RL$, and obtain that this state is exponentially close to

$$\sum_{x \in L,\ \|x\| < \sqrt{n}} \rho(x)\,|x \bmod P(RL)\rangle.$$

We measure this state and obtain $x \bmod P(RL)$ for some vector $x$ with $\|x\| < \sqrt{n}$.
Since $x \bmod P(RL)$ is within $\sqrt{n}$ of the lattice $RL$, and $\lambda_1(RL) \ge 2^{3n}$, we can
recover $x$ by using, say, Babai's nearest plane algorithm [Babai 1986]. The output
of the algorithm is $x$.

We claim that the distribution of $x$ is exponentially close to $D_{L,1/\sqrt{2}}$. Indeed, the
probability of obtaining any $x \in L$, $\|x\| < \sqrt{n}$ is proportional to $\rho(x)^2 = \rho_{1/\sqrt{2}}(x)$.
It remains to notice that by Lemma 2.5, all but an exponentially small fraction of
the probability distribution $D_{L,1/\sqrt{2}}$ is on points of norm less than $\sqrt{n}$.
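The key identity in the Fourier step, that the transform of a Gaussian on the fine lattice $L^*/R$ is a periodized Gaussian on $L$, can be checked classically in one dimension. The sketch below is only a numerical illustration under the assumption $L = L^* = \mathbb{Z}$ (so $\det(RL) = R$), with truncated sums and hypothetical function names; it does not simulate the quantum algorithm.

```python
import cmath
import math


def rho(x):
    """Gaussian function rho(x) = exp(-pi * x^2)."""
    return math.exp(-math.pi * x * x)


def fourier_amplitude(t, R, cutoff=20):
    """Truncated sum over s in Z_R, r in Z of rho(s/R - r) * exp(2*pi*i*s*t/R)."""
    total = 0j
    for s in range(R):
        weight = sum(rho(s / R - r) for r in range(-cutoff, cutoff + 1))
        total += weight * cmath.exp(2j * math.pi * s * t / R)
    return total


def periodized_gaussian(t, R, cutoff=20):
    """det(RL) * sum over y in RL of rho(y - L t), specialised to L = Z."""
    return R * sum(rho(R * k - t) for k in range(-cutoff, cutoff + 1))


if __name__ == "__main__":
    R = 8
    for t in range(R):
        print(t, abs(fourier_amplitude(t, R) - periodized_gaussian(t, R)))
```

The two sides agree up to floating-point error, and the amplitudes are largest for $t$ close to 0 modulo $R$, which corresponds to short vectors of $L$.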


3.3. STANDARD LATTICE PROBLEMS. We now complete the proof of the main
theorem by reducing the standard lattice problems GAPSVP and SIVP to DGS. We
start with SIVP. The basic idea of the reduction is simple: we call the DGS oracle
enough times. We show that with high probability, there are n short linearly
independent vectors among the returned vectors. We prove this by using the following
lemma, which appeared in the preliminary version of [Micciancio and Regev 2007].
We include the proof since only a proof sketch was given there.


LEMMA 3.15. Let $L$ be an $n$-dimensional lattice and let $r$ be such that
$r \ge \sqrt{2}\,\eta_\epsilon(L)$ where $\epsilon \le \frac{1}{10}$. Then, for any subspace $H$ of dimension at most $n - 1$
the probability that $x \notin H$ where $x$ is chosen from $D_{L,r}$ is at least $\frac{1}{10}$.


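PROOF. Assume without loss of generality that the vector $(1, 0, \ldots, 0)$ is
orthogonal to $H$. Using Lemma 2.14,

$$\begin{aligned}
\mathop{\mathrm{Exp}}_{x \sim D_{L,r}}\left[\exp\left(-\pi (x_1/r)^2\right)\right]
&= \frac{1}{\rho_r(L)} \sum_{x \in L} \exp\left(-\pi(\sqrt{2}x_1/r)^2\right) \exp\left(-\pi(x_2/r)^2\right) \cdots \exp\left(-\pi(x_n/r)^2\right) \\
&= \frac{\det(L^*)\,r^n}{\sqrt{2}\,\rho_r(L)} \sum_{y \in L^*} \exp\left(-\pi(r y_1/\sqrt{2})^2\right) \exp\left(-\pi(r y_2)^2\right) \cdots \exp\left(-\pi(r y_n)^2\right) \\
&\le \frac{\det(L^*)\,r^n}{\sqrt{2}\,\rho_r(L)}\,\rho_{\sqrt{2}/r}(L^*) \\
&\le \frac{\det(L^*)\,r^n}{\sqrt{2}\,\rho_r(L)}\,(1 + \epsilon).
\end{aligned}$$

By using Lemma 2.14 again, we see that $\rho_r(L) = \det(L^*)\,r^n \rho_{1/r}(L^*) \ge \det(L^*)\,r^n$.
Therefore, the expectation above is at most $\frac{1}{\sqrt{2}}(1 + \epsilon) < 0.9$ and the lemma
follows.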


COROLLARY 3.16. Let $L$ be an $n$-dimensional lattice and let $r$ be such that
$r \ge \sqrt{2}\,\eta_\epsilon(L)$ where $\epsilon \le \frac{1}{10}$. Then, the probability that a set of $n^2$ vectors chosen
independently from $D_{L,r}$ contains no $n$ linearly independent vectors is exponentially
small.


PROOF. Let $x_1, \ldots, x_{n^2}$ be $n^2$ vectors chosen independently from $D_{L,r}$. For
$i = 1, \ldots, n$, let $B_i$ be the event that

$$\dim \mathrm{span}(x_1, \ldots, x_{(i-1)n}) = \dim \mathrm{span}(x_1, \ldots, x_{in}) < n.$$

Clearly, if none of the $B_i$'s happens, then $\dim \mathrm{span}(x_1, \ldots, x_{n^2}) = n$. Hence, in
order to complete the proof it suffices to show that for all $i$, $\Pr[B_i] \le 2^{-\Omega(n)}$. So
fix some $i$, and let us condition on some fixed choice of $x_1, \ldots, x_{(i-1)n}$ such that
$\dim \mathrm{span}(x_1, \ldots, x_{(i-1)n}) < n$. By Lemma 3.15, the probability that

$$x_{(i-1)n+1}, \ldots, x_{in} \in \mathrm{span}(x_1, \ldots, x_{(i-1)n})$$

is at most $(9/10)^n = 2^{-\Omega(n)}$. This implies that $\Pr[B_i] \le 2^{-\Omega(n)}$, as required.
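The corollary can be observed experimentally on a toy lattice. The following sketch assumes $L = \mathbb{Z}^n$ with a rejection sampler for $D_{\mathbb{Z},r}$ (tail cut at $6r$) and computes the rank of $n^2$ samples exactly over the rationals; the helper names and the parameters $n = 4$, $r = 8$ are illustrative choices, not values from the paper.

```python
import math
import random
from fractions import Fraction


def sample_discrete_gaussian_z(r, rng):
    """Rejection-sample x in Z with probability proportional to exp(-pi * x^2 / r^2)."""
    bound = int(math.ceil(6 * r))
    while True:
        x = rng.randint(-bound, bound)
        if rng.random() < math.exp(-math.pi * (x / r) ** 2):
            return x


def rank(vectors):
    """Rank over the rationals, by Gaussian elimination with exact arithmetic."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk


if __name__ == "__main__":
    rng = random.Random(0)
    n, r = 4, 8.0
    samples = [[sample_discrete_gaussian_z(r, rng) for _ in range(n)] for _ in range(n * n)]
    print(rank(samples))
```

With a width this far above the smoothing parameter, the samples span the whole space except with very small probability.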


In the following lemma we give the reduction from SIVP (in fact, GIVP) to
DGS. It shows that under the assumptions of Theorem 3.1, there exists an efficient
quantum algorithm for $\mathrm{GIVP}_{2\sqrt{2}n\eta_\epsilon(L)/\alpha}$. By Lemma 2.12, this algorithm also solves
$\mathrm{SIVP}_{\tilde{O}(n/\alpha)}$.


LEMMA 3.17. For any $\epsilon = \epsilon(n) \le \frac{1}{10}$ and any $\varphi(L) \ge \sqrt{2}\,\eta_\epsilon(L)$, there is a
polynomial time reduction from $\mathrm{GIVP}_{2\sqrt{n}\varphi}$ to $\mathrm{DGS}_\varphi$.


PROOF. As mentioned above, the idea of the reduction is to simply call the
DGS oracle in an attempt to find short linearly independent vectors. One technical
complication is that the function $\varphi$ is not necessarily efficiently computable, and
hence we do not know which parameter $r$ to give the DGS oracle. The solution is
easy: we just try many values of $r$ and take the shortest set of $n$ linearly independent
vectors found.

We now present the reduction in detail. The input to the reduction is a lattice $L$. We
first apply the LLL algorithm [Lenstra et al. 1982] to obtain $n$ linearly independent
vectors of length at most $2^n\lambda_n(L)$. Let $S$ denote the resulting set, and let $\tilde{\lambda}_n$ be the
length of the longest vector in $S$. By construction we have $\lambda_n(L) \le \tilde{\lambda}_n \le 2^n\lambda_n(L)$.
For each $i \in \{0, \ldots, 2n\}$ call the DGS oracle $n^2$ times with the pair $(L, r_i)$ where
$r_i = \tilde{\lambda}_n 2^{-i}$, and let $S_i$ be the resulting set of vectors. At the end, look for a set of $n$
linearly independent vectors in each of $S, S_0, S_1, \ldots, S_{2n}$, and output the shortest
set found.

We now prove correctness. If $\varphi(L) \ge \tilde{\lambda}_n$ then $S$ is already shorter than $2\sqrt{n}\varphi(L)$
and so we are done. Otherwise, let $i \in \{0, \ldots, 2n\}$ be such that $\varphi(L) < r_i \le 2\varphi(L)$.
Such an $i$ must exist by Claim 2.13. By Corollary 3.16, $S_i$ contains $n$ linearly
independent vectors with probability exponentially close to 1. Moreover, by Lemma 2.5,
all vectors in $S_i$ are of length at most $r_i\sqrt{n} \le 2\sqrt{n}\varphi(L)$ with probability
exponentially close to 1. Hence, our reduction outputs a set of $n$ linearly independent vectors
of length at most $2\sqrt{n}\varphi(L)$, as required.


We now present the reduction from GAPSVP to DGS. We first define the decision 
version of the closest vector problem (GAPCVP) and a slight variant of it. 


Definition 3.18. An instance of $\mathrm{GAPCVP}_\gamma$ is given by an $n$-dimensional lattice
$L$, a vector $t$, and a number $d > 0$. In YES instances, $\mathrm{dist}(t, L) \le d$, whereas in NO
instances, $\mathrm{dist}(t, L) > \gamma(n) \cdot d$.


Definition 3.19. An instance of $\mathrm{GAPCVP}'_\gamma$ is given by an $n$-dimensional lattice
$L$, a vector $t$, and a number $d > 0$. In YES instances, $\mathrm{dist}(t, L) \le d$. In NO instances,
$\lambda_1(L) > \gamma(n) \cdot d$ and $\mathrm{dist}(t, L) > \gamma(n) \cdot d$.


In Goldreich et al. [1999], it is shown that for any $\gamma = \gamma(n) \ge 1$, there is a
polynomial time reduction from $\mathrm{GAPSVP}_\gamma$ to $\mathrm{GAPCVP}'_\gamma$ (see also Lemma 5.22 in
Micciancio and Regev [2007]). Hence, it suffices to show a reduction from $\mathrm{GAPCVP}'$
to DGS. This reduction is given in the following lemma. By using Lemma 2.11, we
obtain that under the assumptions of Theorem 3.1 there exists an efficient quantum
algorithm for $\mathrm{GAPCVP}'_{O(n/\alpha)}$ (and hence also for $\mathrm{GAPSVP}_{O(n/\alpha)}$).


LEMMA 3.20. For any $\gamma = \gamma(n) \ge 1$, there is a polynomial time reduction from
$\mathrm{GAPCVP}'_{100\sqrt{n}\cdot\gamma(n)}$ to $\mathrm{DGS}_{\sqrt{n}\gamma(n)/\lambda_1(L^*)}$.


PROOF. The main component in our reduction is the NP verifier for coGAPCVP
shown in Aharonov and Regev [2005]. In more detail, Aharonov and Regev [2005]
present an efficient algorithm, call it $V$, whose input consists of an $n$-dimensional
lattice $L$, a vector $t$, a number $d > 0$, and a sequence of vectors $w_1, \ldots, w_N$ in $L^*$ for
some $N = \mathrm{poly}(n)$. When $\mathrm{dist}(t, L) \le d$, the algorithm is guaranteed to reject. When
$\mathrm{dist}(t, L) > 100\sqrt{n}d$, and $w_1, \ldots, w_N$ are chosen from the distribution $D_{L^*,1/(100d)}$,
then the algorithm accepts with probability exponentially close to 1.

The input to the reduction is an $n$-dimensional lattice $L$, a vector $t$, and a number
$d > 0$. We call the DGS oracle $N$ times with the lattice $L^*$ and the value $\frac{1}{100d}$ to
obtain vectors $w_1, \ldots, w_N \in L^*$. We then apply $V$ with $L$, $t$, $d$, and the vectors
$w_1, \ldots, w_N$. We accept if and only if $V$ rejects.

To prove correctness, notice first that in the case of a YES instance, $\mathrm{dist}(t, L) \le d$,
and hence $V$ must reject (irrespective of the $w$'s). In the case of a NO instance
we have that $\frac{1}{100d} > \sqrt{n}\gamma(n)/\lambda_1(L)$, and hence $w_1, \ldots, w_N$ are guaranteed to be
valid samples from $D_{L^*,1/(100d)}$. Moreover, $\mathrm{dist}(t, L) > 100\sqrt{n}\gamma(n)d \ge 100\sqrt{n}d$,
and hence $V$ accepts with probability exponentially close to 1.


4. Variants of the LWE problem 


In this section, we consider several variants of the LWE problem. Through a se- 
quence of elementary reductions, we prove that all problems are as hard as LWE. 
The results of this section are summarized in Lemma 4.4. 


LEMMA 4.1 (AVERAGE-CASE TO WORST-CASE). Let $n, p \ge 1$ be some integers
and $\chi$ be some distribution on $\mathbb{Z}_p$. Assume that we have access to a distinguisher $W$
that distinguishes $A_{s,\chi}$ from $U$ for a nonnegligible fraction of all possible $s$. Then
there exists an efficient algorithm $W'$ that for all $s$ accepts with probability
exponentially close to 1 on inputs from $A_{s,\chi}$ and rejects with probability exponentially
close to 1 on inputs from $U$.


PROOF. The proof is based on the following transformation. For any $t \in \mathbb{Z}_p^n$
consider the function $f_t : \mathbb{Z}_p^n \times \mathbb{Z}_p \to \mathbb{Z}_p^n \times \mathbb{Z}_p$ defined by

$$f_t(a, b) = (a, b + \langle a, t\rangle).$$

It is easy to see that this function transforms the distribution $A_{s,\chi}$ into $A_{s+t,\chi}$.
Moreover, it transforms the uniform distribution $U$ into itself.
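The effect of $f_t$ on a single sample is easy to verify mechanically. In the sketch below, which uses hypothetical helper names and an explicitly chosen error value instead of a draw from $\chi$, a pair built from secret $s$ and error $e$ is mapped to a pair that is consistent with secret $s + t$ and the same error.

```python
import random


def lwe_pair(s, p, error, rng):
    """A pair (a, <a, s> + error mod p) with uniformly random a."""
    a = [rng.randrange(p) for _ in s]
    b = (sum(x * y for x, y in zip(a, s)) + error) % p
    return a, b


def shift_secret(sample, t, p):
    """The map f_t(a, b) = (a, b + <a, t>) over Z_p."""
    a, b = sample
    return a, (b + sum(x * y for x, y in zip(a, t))) % p


def residual(sample, s, p):
    """b - <a, s> mod p, i.e. the error relative to the secret s."""
    a, b = sample
    return (b - sum(x * y for x, y in zip(a, s))) % p


if __name__ == "__main__":
    rng = random.Random(0)
    p, s, t = 11, [1, 2, 3], [4, 5, 6]
    s_plus_t = [(x + y) % p for x, y in zip(s, t)]
    shifted = shift_secret(lwe_pair(s, p, 2, rng), t, p)
    print(residual(shifted, s_plus_t, p))
```

Because $a$ is left untouched and $b$ is only shifted by a function of $a$, a uniform pair stays uniform, which is the second property used in the proof.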

Assume that for $n^{-c_1}$ of all possible $s$, the acceptance probability of $W$ on
inputs from $A_{s,\chi}$ and on inputs from $U$ differ by at least $n^{-c_2}$. We construct $W'$ as
follows. Let $R$ denote the unknown input distribution. Repeat the following $n^{c_1+1}$
times. Choose a vector $t \in \mathbb{Z}_p^n$ uniformly at random. Then, estimate the acceptance
probability of $W$ on $U$ and on $f_t(R)$ by calling $W$ $O(n^{2c_2+1})$ times on each of the
input distributions. By the Chernoff bound, this allows us to obtain an estimate that
with probability exponentially close to 1 is within $\pm n^{-c_2}/8$ of the true acceptance
probabilities. If the two estimates differ by more than $n^{-c_2}/2$, then we stop and
decide to accept. Otherwise, we continue. If the procedure ends without accepting,
we reject.

We now prove that $W'$ distinguishes $A_{s,\chi}$ from $U$ for all $s$. First, we claim
that when $R$ is $U$, the acceptance probability of $W'$ is exponentially close to 0.
Indeed, in this case, $f_t(U) = U$ and therefore the two estimates that $W'$ performs
are of the same distribution. The probability that the estimates differ by more than
$n^{-c_2}/2 > 2 \cdot n^{-c_2}/8$ is exponentially small. Next, consider the case that $R$ is $A_{s,\chi}$
for some $s$. In each of the $n^{c_1+1}$ iterations, we are considering the distribution
$f_t(A_{s,\chi}) = A_{s+t,\chi}$ for some uniformly chosen $t$. Notice that the distribution of
$s + t$ is uniform on $\mathbb{Z}_p^n$. Hence, with probability exponentially close to 1, in one of
the $n^{c_1+1}$ iterations, $t$ is such that the acceptance probability of $W$ on inputs from
$A_{s+t,\chi}$ and on inputs from $U$ differ by at least $n^{-c_2}$. Since our estimates are within
$\pm n^{-c_2}/8$, $W'$ accepts with probability exponentially close to 1.


LEMMA 4.2 (DECISION TO SEARCH). Let $n \ge 1$ be some integer, $2 \le p \le \mathrm{poly}(n)$
be a prime, and $\chi$ be some distribution on $\mathbb{Z}_p$. Assume that we have
access to procedure $W$ that for all $s$ accepts with probability exponentially close
to 1 on inputs from $A_{s,\chi}$ and rejects with probability exponentially close to 1 on
inputs from $U$. Then, there exists an efficient algorithm $W'$ that, given samples
from $A_{s,\chi}$ for some $s$, outputs $s$ with probability exponentially close to 1.


PROOF. Let us show how $W'$ finds $s_1 \in \mathbb{Z}_p$, the first coordinate of $s$. Finding the
other coordinates is similar. For any $k \in \mathbb{Z}_p$, consider the following transformation.
Given a pair $(a, b)$ we output the pair $(a + (l, 0, \ldots, 0), b + l \cdot k)$ where $l \in \mathbb{Z}_p$
is chosen uniformly at random. It is easy to see that this transformation takes the
uniform distribution into itself. Moreover, if $k = s_1$ then this transformation also
takes $A_{s,\chi}$ to itself. Finally, if $k \ne s_1$ then it takes $A_{s,\chi}$ to the uniform distribution
(note that this requires $p$ to be prime). Hence, using $W$, we can test whether $k = s_1$.
Since there are only $p < \mathrm{poly}(n)$ possibilities for $s_1$ we can try all of them.
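The coordinate-by-coordinate search can be sketched as follows. This is an illustration of the mechanics only: the distinguisher used in the demo is a hypothetical stand-in that is given the secret and simply checks whether the errors are small, which is of course circular, and the error distribution (uniform on $\{-2, \ldots, 2\}$) and all function names are assumptions made for the example.

```python
import random


def inner(a, s, p):
    return sum(x * y for x, y in zip(a, s)) % p


def make_samples(s, p, count, rng, max_err=2):
    """Toy LWE pairs (a, <a, s> + e mod p) with e uniform in [-max_err, max_err]."""
    out = []
    for _ in range(count):
        a = [rng.randrange(p) for _ in s]
        out.append((a, (inner(a, s, p) + rng.randint(-max_err, max_err)) % p))
    return out


def rerandomize(sample, coord, k, p, rng):
    """(a, b) -> (a + l * e_coord, b + l * k) for a fresh uniform l in Z_p."""
    a, b = sample
    l = rng.randrange(p)
    a2 = list(a)
    a2[coord] = (a2[coord] + l) % p
    return a2, (b + l * k) % p


def make_toy_distinguisher(s, p, max_err=2):
    """Hypothetical W for the demo: accepts iff at least 90% of the errors are small."""
    def accepts(samples):
        small = 0
        for a, b in samples:
            d = (b - inner(a, s, p)) % p
            small += min(d, p - d) <= max_err
        return small >= 0.9 * len(samples)
    return accepts


def recover_secret(samples, p, distinguisher, rng):
    """Try every k in Z_p for every coordinate, keeping the k that W accepts."""
    secret = []
    for coord in range(len(samples[0][0])):
        for k in range(p):
            if distinguisher([rerandomize(x, coord, k, p, rng) for x in samples]):
                secret.append(k)
                break
        else:
            raise ValueError("no candidate accepted for coordinate %d" % coord)
    return secret


if __name__ == "__main__":
    rng = random.Random(0)
    secret = [5, 0, 96, 42]
    pairs = make_samples(secret, 97, 200, rng)
    print(recover_secret(pairs, 97, make_toy_distinguisher(secret, 97), rng))
```

When the guess $k$ is wrong, the error picks up the term $l(k - s_1)$, which is uniform because $p$ is prime, so the transformed samples look uniform and the distinguisher rejects.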


LEMMA 4.3 (DISCRETE TO CONTINUOUS). Let $n, p \ge 1$ be some integers, let $\phi$
be some probability density function on $\mathbb{T}$, and let $\bar{\phi}$ be its discretization to $\mathbb{Z}_p$.
Assume that we have access to an algorithm $W$ that solves $\mathrm{LWE}_{p,\bar{\phi}}$. Then, there
exists an efficient algorithm $W'$ that solves $\mathrm{LWE}_{p,\phi}$.


PROOF. Algorithm $W'$ simply takes samples from $A_{s,\phi}$ and discretizes the
second element to obtain samples from $A_{s,\bar{\phi}}$. It then applies $W$ with these samples in
order to find $s$.


By combining the three lemmas above, we obtain 


LEMMA 4.4. Let $n \ge 1$ be an integer and $2 \le p \le \mathrm{poly}(n)$ be a prime. Let $\phi$
be some probability density function on $\mathbb{T}$ and let $\bar{\phi}$ be its discretization to $\mathbb{Z}_p$.
Assume that we have access to a distinguisher that distinguishes $A_{s,\bar{\phi}}$ from $U$ for
a non-negligible fraction of all possible $s$. Then, there exists an efficient algorithm
that solves $\mathrm{LWE}_{p,\phi}$.


5. Public Key Cryptosystem


We let $n$ be the security parameter of the cryptosystem. Our cryptosystem is
parameterized by two integers $m, p$ and a probability distribution $\chi$ on $\mathbb{Z}_p$. A setting of these
parameters that guarantees both security and correctness is the following. Choose
$p \ge 2$ to be some prime number between $n^2$ and $2n^2$ and let $m = (1+\epsilon)(n+1)\log p$
for some arbitrary constant $\epsilon > 0$. The probability distribution $\chi$ is taken to be $\bar{\Psi}_{\alpha(n)}$
for $\alpha(n) = o(1/(\sqrt{n}\log n))$, that is, $\alpha(n)$ is such that $\lim_{n\to\infty} \alpha(n) \cdot \sqrt{n}\log n = 0$.
For example, we can choose $\alpha(n) = 1/(\sqrt{n}\log^2 n)$. In the following description,
all additions are performed in $\mathbb{Z}_p$, i.e., modulo $p$.


- Private key: Choose $s \in \mathbb{Z}_p^n$ uniformly at random. The private key is $s$.

- Public Key: For $i = 1, \ldots, m$, choose $m$ vectors $a_1, \ldots, a_m \in \mathbb{Z}_p^n$
independently from the uniform distribution. Also choose elements $e_1, \ldots, e_m \in \mathbb{Z}_p$
independently according to $\chi$. The public key is given by $(a_i, b_i)_{i=1}^m$ where
$b_i = \langle a_i, s\rangle + e_i$.

- Encryption: In order to encrypt a bit, we choose a random set $S$ uniformly
among all $2^m$ subsets of $[m]$. The encryption is $(\sum_{i \in S} a_i, \sum_{i \in S} b_i)$ if the bit is 0
and $(\sum_{i \in S} a_i, \lfloor p/2 \rfloor + \sum_{i \in S} b_i)$ if the bit is 1.

- Decryption: The decryption of a pair $(a, b)$ is 0 if $b - \langle a, s\rangle$ is closer to 0 than
to $\lfloor p/2 \rfloor$ modulo $p$. Otherwise, the decryption is 1.
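The four steps above translate almost directly into code. The following is a minimal, insecure sketch for experimentation only: it uses Python's non-cryptographic `random` module, models $\bar{\Psi}_\alpha$ by rounding $p$ times a normal variable with standard deviation $\alpha/\sqrt{2\pi}$, and the demo parameters ($n = 16$, $p = 257$, $m = 150$, $\alpha = 0.005$) are small illustrative values, with $\alpha$ chosen by hand rather than by the asymptotic recipe in the text.

```python
import math
import random


def keygen(n, m, p, alpha, rng):
    """Return (private key s, public key [(a_i, b_i)]) with b_i = <a_i, s> + e_i mod p."""
    s = [rng.randrange(p) for _ in range(n)]
    public = []
    for _ in range(m):
        a = [rng.randrange(p) for _ in range(n)]
        # Discretized noise: normal with std alpha/sqrt(2*pi), scaled by p and rounded.
        e = round(p * rng.gauss(0.0, alpha / math.sqrt(2 * math.pi))) % p
        public.append((a, (sum(x * y for x, y in zip(a, s)) + e) % p))
    return s, public


def encrypt(public, bit, p, rng):
    """Sum a uniformly random subset of the public pairs; add floor(p/2) for bit 1."""
    a_sum, b_sum = [0] * len(public[0][0]), 0
    for a, b in public:
        if rng.random() < 0.5:                      # each index joins S with probability 1/2
            a_sum = [(x + y) % p for x, y in zip(a_sum, a)]
            b_sum = (b_sum + b) % p
    if bit == 1:
        b_sum = (b_sum + p // 2) % p
    return a_sum, b_sum


def decrypt(s, ciphertext, p):
    """Output 0 if b - <a, s> is closer to 0 than to floor(p/2) modulo p, else 1."""
    a, b = ciphertext
    d = (b - sum(x * y for x, y in zip(a, s))) % p
    dist_to_zero = min(d, p - d)
    dist_to_half = min((d - p // 2) % p, (p // 2 - d) % p)
    return 0 if dist_to_zero < dist_to_half else 1


if __name__ == "__main__":
    rng = random.Random(1)
    n, p, m, alpha = 16, 257, 150, 0.005
    s, pk = keygen(n, m, p, alpha, rng)
    print([decrypt(s, encrypt(pk, bit, p, rng), p) for bit in (0, 1, 1, 0)])
```

Choosing each index independently with probability 1/2 is the same as picking $S$ uniformly among all $2^m$ subsets of $[m]$.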


Notice that with our choice of parameters, the public key size is $O(mn \log p) = \tilde{O}(n^2)$
and the encryption process increases the size of a message by a factor of
$O(n \log p) = \tilde{O}(n)$. In fact, it is possible to reduce the size of the public key to
$O(m \log p) = \tilde{O}(n)$ by the following idea of Ajtai [2005]. Assume all users of the
cryptosystem share some fixed (and trusted) random choice of $a_1, \ldots, a_m$. This
can be achieved by, say, distributing these vectors as part of the encryption and
decryption software. Then, the public key need only consist of $b_1, \ldots, b_m$. This
modification does not affect the security of the cryptosystem.

We next prove that under a certain condition on $\chi$, $m$, and $p$, the probability of
decryption error is small. We later show that our choice of parameters satisfies this
condition. For the following two lemmas, we need to introduce some additional
notation. For a distribution $\chi$ on $\mathbb{Z}_p$ and an integer $k \ge 0$, we define $\chi^{\star k}$ as the
distribution obtained by summing together $k$ independent samples from $\chi$, where
addition is performed in $\mathbb{Z}_p$ (for $k = 0$ we define $\chi^{\star 0}$ as the distribution that is
constantly 0). For a probability distribution $\phi$ on $\mathbb{T}$ we define $\phi^{\star k}$ similarly. For
an element $a \in \mathbb{Z}_p$ we define $|a|$ as the integer $a$ if $a \in \{0, 1, \ldots, \lfloor p/2\rfloor\}$ and as the
integer $p - a$ otherwise. In other words, $|a|$ represents the distance of $a$ from 0.
Similarly, for $x \in \mathbb{T}$, we define $|x|$ as $x$ for $x \in [0, \frac{1}{2}]$ and as $1 - x$ otherwise.


LEMMA 5.1 (CORRECTNESS). Let $\delta > 0$. Assume that for any $k \in \{0, 1, \ldots, m\}$,
$\chi^{\star k}$ satisfies that

\[ \Pr_{e\sim\chi^{\star k}}\left[\, |e| < \lfloor p/2\rfloor / 2 \,\right] > 1 - \delta. \]

Then, the probability of decryption error is at most $\delta$. That is, for any bit $c \in \{0, 1\}$,
if we use the protocol above to choose private and public keys, encrypt $c$, and then
decrypt the result, then the outcome is $c$ with probability at least $1 - \delta$.

PROOF. Consider first an encryption of 0. It is given by $(a, b)$ for $a = \sum_{i\in S} a_i$
and

\[ b = \sum_{i\in S} b_i = \sum_{i\in S} \bigl(\langle a_i, s\rangle + e_i\bigr) = \langle a, s\rangle + \sum_{i\in S} e_i. \]

Hence, $b - \langle a, s\rangle$ is exactly $\sum_{i\in S} e_i$. The distribution of the latter is $\chi^{\star|S|}$. According
to our assumption, $\left|\sum_{i\in S} e_i\right|$ is less than $\lfloor p/2\rfloor/2$ with probability at least $1 - \delta$. In
this case, it is closer to 0 than to $\lfloor p/2\rfloor$ and therefore the decryption is correct. The
proof for an encryption of 1 is similar.
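
For completeness, the case of an encryption of 1 can be spelled out as follows. This is only a sketch of the "similar" argument; it uses the definition of $|\cdot|$ on $\mathbb{Z}_p$ given above together with the triangle inequality for this distance, and it assumes the event $\left|\sum_{i\in S} e_i\right| < \lfloor p/2\rfloor/2$ from the lemma's hypothesis.

```latex
\begin{align*}
b - \langle a, s\rangle
  &= \lfloor p/2 \rfloor + \sum_{i\in S} e_i, \\
\Bigl|\bigl(b - \langle a, s\rangle\bigr) - \lfloor p/2 \rfloor\Bigr|
  &= \Bigl|\sum_{i\in S} e_i\Bigr| \;<\; \tfrac{1}{2}\lfloor p/2 \rfloor, \\
\bigl|b - \langle a, s\rangle\bigr|
  &\ge \lfloor p/2 \rfloor - \Bigl|\sum_{i\in S} e_i\Bigr| \;>\; \tfrac{1}{2}\lfloor p/2 \rfloor .
\end{align*}
```

So $b - \langle a, s\rangle$ is closer to $\lfloor p/2\rfloor$ than to 0, and the decryption outputs 1.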


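CLAIM 5.2. For our choice of parameters it holds that for any $k \in \{0, 1, \ldots, m\}$,

\[ \Pr_{e\sim\bar\Psi_\alpha^{\star k}}\left[\, |e| < \lfloor p/2\rfloor / 2 \,\right] > 1 - \delta(n) \]

for some negligible function $\delta(n)$.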


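PROOF. A sample from $\bar\Psi_\alpha^{\star k}$ can be obtained by sampling $x_1, \ldots, x_k$ from
$\Psi_\alpha$ and outputting $\sum_{i=1}^k \lfloor p x_i\rceil \bmod p$. Notice that this value is at most
$k \le m < p/32$ away from $\sum_{i=1}^k p x_i \bmod p$. Hence, it is enough to show that
$\left|\sum_{i=1}^k p x_i \bmod p\right| < p/16$ with high probability. This condition is equivalent to
the condition that $\left|\sum_{i=1}^k x_i \bmod 1\right| < 1/16$. Since $\sum_{i=1}^k x_i \bmod 1$ is distributed as
$\Psi_{\sqrt{k}\cdot\alpha}$ and $\sqrt{k}\cdot\alpha = o(1/\sqrt{\log n})$, the probability that
$\left|\sum_{i=1}^k x_i \bmod 1\right| < 1/16$ is $1 - \delta(n)$ for some negligible function $\delta(n)$.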
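
The two asymptotic facts used in this proof can be checked as follows. This is a sketch under two stated assumptions: that $\log p = O(\log n)$ (which holds since $p \le 2n^2$), and that $\Psi_\beta$ is a normal distribution with standard deviation $\beta/\sqrt{2\pi}$ reduced modulo 1, following the paper's earlier definition. The tail estimate is the standard Gaussian tail bound, applied before the reduction modulo 1.

```latex
\begin{align*}
\sqrt{k}\cdot\alpha \;\le\; \sqrt{m}\cdot\alpha
  &= \sqrt{(1+\epsilon)(n+1)\log p}\;\cdot\; o\!\left(\frac{1}{\sqrt{n}\,\log n}\right) \\
  &= O\!\left(\sqrt{n\log n}\right)\cdot o\!\left(\frac{1}{\sqrt{n}\,\log n}\right)
   \;=\; o\!\left(\frac{1}{\sqrt{\log n}}\right), \\[4pt]
\Pr_{x\sim\Psi_\beta}\!\left[\,|x| \ge \tfrac{1}{16}\,\right]
  &\le 2\exp\!\left(-\frac{\pi}{256\,\beta^{2}}\right)
   \;=\; 2\exp\bigl(-\omega(\log n)\bigr) \;=\; n^{-\omega(1)}
   \qquad\text{for } \beta = \sqrt{k}\cdot\alpha .
\end{align*}
```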


In order to prove the security of the system, we need the following special case 
of the leftover hash lemma that appears in Impagliazzo and Zuckerman [1989]. We 
include a proof for completeness. 


CLAIM 5.3. Let $G$ be some finite Abelian group and let $l$ be some integer. For
any $l$ elements $g_1, \ldots, g_l \in G$ consider the statistical distance between the uniform
distribution on $G$ and the distribution given by the sum of a random subset of
$g_1, \ldots, g_l$. Then, the expectation of this statistical distance over a uniform choice of
$g_1, \ldots, g_l \in G$ is at most $\sqrt{|G|/2^l}$. In particular, the probability that this statistical
distance is more than $\sqrt[4]{|G|/2^l}$ is at most $\sqrt[4]{|G|/2^l}$.


PROOF. For a choice $g = (g_1, \ldots, g_l)$ of $l$ elements from $G$, let $P_g$ be the
distribution of the sum of a random subset of $g_1, \ldots, g_l$, i.e.,

\[ P_g(h) = \frac{1}{2^l}\,\Bigl|\Bigl\{ b \in \{0,1\}^l \;\Bigm|\; \sum_i b_i g_i = h \Bigr\}\Bigr|. \]


In order to show that this distribution is close to uniform, we compute its $\ell_2$ norm,
and note that it is very close to $1/|G|$. From this it will follow that the distribution
must be close to the uniform distribution. The $\ell_2$ norm of $P_g$ is given by

\[ \sum_{h\in G} P_g(h)^2 = \Pr_{b,b'}\Bigl[\sum_i b_i g_i = \sum_i b'_i g_i\Bigr]
   \le \frac{1}{2^l} + \Pr_{b,b'}\Bigl[\sum_i b_i g_i = \sum_i b'_i g_i \Bigm| b \ne b'\Bigr]. \]

Taking expectation over $g$, and using the fact that for any $b \ne b'$,
$\Pr_g\bigl[\sum_i b_i g_i = \sum_i b'_i g_i\bigr] = 1/|G|$, we obtain that

\[ \mathop{\mathrm{Exp}}_g\Bigl[\sum_h P_g(h)^2\Bigr] \le \frac{1}{2^l} + \frac{1}{|G|}. \]


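Finally, the expected distance from the uniform distribution is

\[ \begin{aligned}
\mathop{\mathrm{Exp}}_g\Bigl[\sum_h \bigl|P_g(h) - 1/|G|\bigr|\Bigr]
  &\le \mathop{\mathrm{Exp}}_g\Bigl[|G|^{1/2}\Bigl(\sum_h \bigl(P_g(h) - 1/|G|\bigr)^2\Bigr)^{1/2}\Bigr] \\
  &= \sqrt{|G|}\,\mathop{\mathrm{Exp}}_g\Bigl[\Bigl(\sum_h P_g(h)^2 - 1/|G|\Bigr)^{1/2}\Bigr] \\
  &\le \sqrt{|G|}\,\Bigl(\mathop{\mathrm{Exp}}_g\Bigl[\sum_h P_g(h)^2\Bigr] - 1/|G|\Bigr)^{1/2} \\
  &\le \sqrt{\frac{|G|}{2^l}}.
\end{aligned} \]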
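
The bound just derived can be checked exhaustively on a tiny instance. The sketch below is an illustration rather than part of the proof: it takes $G = \mathbb{Z}_q$ with the hypothetical small values $q = 3$ and $l = 5$, enumerates every choice of $g$ and every subset, and compares the exact expectation of $\sum_h |P_g(h) - 1/|G||$ (the quantity bounded in the last chain of inequalities) with $\sqrt{|G|/2^l}$.

```python
import itertools
import math

def l1_distance_from_uniform(g, q):
    """Sum over h of |P_g(h) - 1/q| for the subset-sum distribution P_g on Z_q."""
    l = len(g)
    counts = [0] * q
    for bits in itertools.product((0, 1), repeat=l):
        h = sum(b * x for b, x in zip(bits, g)) % q
        counts[h] += 1
    return sum(abs(c / 2 ** l - 1 / q) for c in counts)

def expected_distance(q, l):
    """Exact expectation over all q**l choices of g (feasible only for tiny q, l)."""
    total = 0.0
    for g in itertools.product(range(q), repeat=l):
        total += l1_distance_from_uniform(g, q)
    return total / q ** l

q, l = 3, 5
avg = expected_distance(q, l)
bound = math.sqrt(q / 2 ** l)
print(avg, bound)
```

Exhaustive enumeration is used instead of random sampling so that the comparison is exact; it is only practical for very small $q$ and $l$.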


We now prove that our cryptosystem is semantically secure, that is, that it is
hard to distinguish between encryptions of 0 and encryptions of 1. More precisely,
we show that if such a distinguisher exists, then there exists a distinguisher that
distinguishes between $A_{s,\chi}$ and $U$ for a non-negligible fraction of all $s$. If $\chi = \bar\Psi_\alpha$ and
$p \le \mathrm{poly}(n)$ is a prime, then by Lemma 4.4, this also implies an efficient (classical)
algorithm that solves $\mathrm{LWE}_{p,\Psi_\alpha}$. This, in turn, implies, by Theorem 3.1, an efficient
quantum algorithm for $\mathrm{DGS}_{\sqrt{2n}\cdot\eta_\epsilon(L)/\alpha}$. Finally, by Lemma 3.17, we also obtain
an efficient quantum algorithm for SIVP$_{\tilde{O}(n/\alpha)}$ and by Lemma 3.20, we obtain an
efficient quantum algorithm for GAPSVP$_{O(n/\alpha)}$.


LEMMA 5.4 (SECURITY). For any $\epsilon > 0$ and $m \ge (1+\epsilon)(n+1)\log p$, if there
exists a polynomial time algorithm $W$ that distinguishes between encryptions of 0
and 1 then there exists a distinguisher $Z$ that distinguishes between $A_{s,\chi}$ and $U$ for
a non-negligible fraction of all possible $s$.


PROOF. Let $p_0(W)$ be the acceptance probability of $W$ on input
$((a_i, b_i)_{i=1}^m, (a, b))$ where $(a, b)$ is an encryption of 0 with the public key $(a_i, b_i)_{i=1}^m$
and the probability is taken over the randomness in the choice of the private and
public keys and over the randomness in the encryption algorithm. We define $p_1(W)$
similarly for encryptions of 1 and let $p_u(W)$ be the acceptance probability of $W$
on inputs $((a_i, b_i)_{i=1}^m, (a, b))$ where $(a_i, b_i)_{i=1}^m$ are again chosen according to the
private and public keys distribution but $(a, b)$ is chosen uniformly from $\mathbb{Z}_p^n \times \mathbb{Z}_p$.
With this notation, our hypothesis says that $|p_0(W) - p_1(W)| \ge \frac{1}{n^c}$ for some $c > 0$.


We now construct a $W'$ for which $|p_0(W') - p_u(W')| \ge \frac{1}{2n^c}$. By our hypothesis,
either $|p_0(W) - p_u(W)| \ge \frac{1}{2n^c}$ or $|p_1(W) - p_u(W)| \ge \frac{1}{2n^c}$. In the former case, we
take $W'$ to be the same as $W$. In the latter case, we construct $W'$ as follows. On
input $((a_i, b_i)_{i=1}^m, (a, b))$, $W'$ calls $W$ with $((a_i, b_i)_{i=1}^m, (a, \frac{p-1}{2} + b))$. Notice that
this maps the distribution on encryptions of 0 to the distribution on encryptions of
1 and the uniform distribution to itself. Therefore, $W'$ is the required distinguisher.
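
The wrapper used in the second case is simple enough to write down directly. The sketch below assumes a hypothetical interface in which a distinguisher is a callable `W(public_key, ciphertext)` returning a truth value and a ciphertext is a pair `(a, b)`; neither the interface nor the name `make_w_prime` comes from the text.

```python
def make_w_prime(W, p):
    """Wrap a distinguisher W so that it sees b shifted by (p - 1)/2 modulo p.

    For an odd prime p, (p - 1)/2 equals floor(p/2), so the shift turns an
    encryption of 0 into an encryption of 1 and leaves uniform pairs uniform.
    """
    shift = (p - 1) // 2

    def W_prime(public_key, ciphertext):
        a, b = ciphertext
        return W(public_key, (a, (b + shift) % p))

    return W_prime
```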

For $s \in \mathbb{Z}_p^n$, let $p_0(s)$ be the probability that $W'$ accepts on input $((a_i, b_i)_{i=1}^m, (a, b))$
where $(a_i, b_i)_{i=1}^m$ are chosen from $A_{s,\chi}$, and $(a, b)$ is an encryption of 0 with the
public key $(a_i, b_i)_{i=1}^m$. Similarly, define $p_u(s)$ to be the acceptance probability of $W'$
where $(a_i, b_i)_{i=1}^m$ are chosen from $A_{s,\chi}$, and $(a, b)$ is now chosen uniformly at random
from $\mathbb{Z}_p^n \times \mathbb{Z}_p$. Our assumption on $W'$ says that
$\left|\mathrm{Exp}_s[p_0(s)] - \mathrm{Exp}_s[p_u(s)]\right| \ge \frac{1}{2n^c}$. Define

\[ Y = \left\{ s \;\middle|\; |p_0(s) - p_u(s)| \ge \frac{1}{4n^c} \right\}. \]

By an averaging argument we get that a fraction of at least $\frac{1}{4n^c}$ of the $s$ are in $Y$:
otherwise, since $|p_0(s) - p_u(s)| \le 1$ for all $s$, the expectation of $|p_0(s) - p_u(s)|$ over $s$
would be less than $\frac{1}{4n^c} + \frac{1}{4n^c}$, contradicting our assumption on $W'$.
Hence, it is enough to show a distinguisher $Z$ that distinguishes between $U$ and
$A_{s,\chi}$ for any $s \in Y$.

In the following, we describe the distinguisher $Z$. We are given a distribution $R$
that is either $U$ or $A_{s,\chi}$ for some $s \in Y$. We take $m$ samples $(a_i, b_i)_{i=1}^m$ from $R$. Let
$p_0((a_i, b_i)_{i=1}^m)$ be the probability that $W'$ accepts on input $((a_i, b_i)_{i=1}^m, (a, b))$ where
the probability is taken on the choice of $(a, b)$ as an encryption of the bit 0 with
the public key $(a_i, b_i)_{i=1}^m$. Similarly, let $p_u((a_i, b_i)_{i=1}^m)$ be the probability that $W'$
accepts on input $((a_i, b_i)_{i=1}^m, (a, b))$ where the probability is taken over the choice
of $(a, b)$ as a uniform element of $\mathbb{Z}_p^n \times \mathbb{Z}_p$. By applying $W'$ a polynomial number
of times, the distinguisher $Z$ estimates both $p_0((a_i, b_i)_{i=1}^m)$ and $p_u((a_i, b_i)_{i=1}^m)$ up to
an additive error of $\frac{1}{64n^c}$. If the two estimates differ by more than $\frac{1}{16n^c}$, $Z$ accepts.
Otherwise, $Z$ rejects.

We first claim that when $R$ is the uniform distribution, $Z$ rejects with high
probability. In this case, $(a_i, b_i)_{i=1}^m$ are chosen uniformly from $\mathbb{Z}_p^n \times \mathbb{Z}_p$. Using Claim 5.3
with the group $G = \mathbb{Z}_p^n \times \mathbb{Z}_p$, we obtain that with probability exponentially close
to 1, the distribution on $(a, b)$ obtained by encryptions of 0 is exponentially close
to the uniform distribution on $\mathbb{Z}_p^n \times \mathbb{Z}_p$. Therefore, except with exponentially small
probability,

\[ \left| p_0\bigl((a_i, b_i)_{i=1}^m\bigr) - p_u\bigl((a_i, b_i)_{i=1}^m\bigr) \right| \le 2^{-\Omega(n)}. \]

Hence, our two estimates differ by at most $\frac{1}{32n^c} + 2^{-\Omega(n)}$, which is less than $\frac{1}{16n^c}$
for large enough $n$, and $Z$ rejects.

Next, we show that if $R$ is $A_{s,\chi}$ for $s \in Y$ then $Z$ accepts with probability
$1/\mathrm{poly}(n)$. Notice that $p_0(s)$ (respectively, $p_u(s)$) is the average of $p_0((a_i, b_i)_{i=1}^m)$
(respectively, $p_u((a_i, b_i)_{i=1}^m)$) taken over the choice of $(a_i, b_i)_{i=1}^m$ from $A_{s,\chi}$. From
$|p_0(s) - p_u(s)| \ge \frac{1}{4n^c}$ we obtain by an averaging argument that

\[ \left| p_0\bigl((a_i, b_i)_{i=1}^m\bigr) - p_u\bigl((a_i, b_i)_{i=1}^m\bigr) \right| \ge \frac{1}{8n^c} \]

with probability at least $\frac{1}{8n^c}$ over the choice of $(a_i, b_i)_{i=1}^m$ from $A_{s,\chi}$. Hence, with
probability at least $\frac{1}{8n^c}$, $Z$ chooses such a $(a_i, b_i)_{i=1}^m$, and since our estimates are
accurate to within $\frac{1}{64n^c}$, the difference between them is more than $\frac{1}{16n^c}$ and $Z$
accepts.
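
The distinguisher $Z$ described in this proof can be sketched as a Monte Carlo procedure. The code below is an illustration under stated assumptions, not the construction's definitive form: `W_prime` is a hypothetical callable `W_prime(samples, ciphertext)` returning a truth value, `trials` stands in for the "polynomial number of times" that $W'$ is applied, and `threshold` plays the role of $\frac{1}{16n^c}$.

```python
import random

def encrypt_zero(samples, p, rng):
    """Encryption of 0: the sum of a uniformly random subset of the samples."""
    n = len(samples[0][0])
    a_sum, b_sum = [0] * n, 0
    for a, b in samples:
        if rng.getrandbits(1):
            a_sum = [(x + y) % p for x, y in zip(a_sum, a)]
            b_sum = (b_sum + b) % p
    return a_sum, b_sum

def uniform_pair(n, p, rng):
    """A uniform element of Z_p^n x Z_p."""
    return [rng.randrange(p) for _ in range(n)], rng.randrange(p)

def distinguisher_Z(samples, W_prime, p, threshold, trials, rng):
    """Accept (return True) if the estimated acceptance probabilities of W_prime
    on encryptions of 0 and on uniform pairs differ by more than `threshold`."""
    n = len(samples[0][0])
    hits_zero = sum(
        bool(W_prime(samples, encrypt_zero(samples, p, rng))) for _ in range(trials)
    )
    hits_unif = sum(
        bool(W_prime(samples, uniform_pair(n, p, rng))) for _ in range(trials)
    )
    return abs(hits_zero - hits_unif) / trials > threshold
```

In the proof, `trials` must be large enough that each estimate is within $\frac{1}{64n^c}$ of the true probability except with small probability; the sketch leaves that choice to the caller.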